SOME OF THE PROBLEMS INVOLVED IN MEASURING
THE OUTCOMES OF BIBLE TEACHING


There was a time in the history of the Protestant
church when Bible teaching was the only form of religious
instruction. The Bible was at the center of the curriculum.
Our schools were Bible Schools instead of schools of Religion.
At  the  present  time  there  is  a decided  swing,  in  our  more 
progressive  schools,  in  the  other  direction.  The  child  is 
being  made  the  center  of  the  curriculum  and  the  materials 
are  selected  to  effect  the  desired  changes  which  we  want  to 
take  place  in  the  child.  With  this  new  principle  becoming 
dominant  the  question  arises  as  to  what  place  Bible  teaching 
will have in the curricula. Surely it will be usable only
in so far as its teaching can be expected to result in the
desired outcomes or changes. We are faced, then, with the
problems involved in measuring the outcomes of Bible teaching,
as well as those of all other teaching in our curricula, if
we are to know with certainty what materials and methods
will accomplish the desired end.

In order to justify these assumptions, it may be necessary
to show that any teaching of worth is that which has a purpose
or objectives. There are many definitions of education but one
of  the  most  significant  is  summed  up  in  the  phrase,  "desirable 
changes".  "Learning  is  essentially  a matter  of  changes  in 
the  abilities  of  the  learner.  The  pupil  may  increase  the 
number  of  facts  he  can  repeat  from  memory,  and  he  may  increase 
the  speed  with  which  he  is  able  to  add  small  numbers,  or  he 
may improve his skill in sawing a board. The learning may consist
in changing his tastes that he comes to like violin music
more than the noise of a bass drum. But in any case of learning
the essential element is a change from one state or condition
to another. To prove that learning has taken place, it
is necessary to compare the learner's ability [before and] after being
taught. The amount of the difference between the ability before
and the ability after being taught is the most reliable
index of the amount of learning that has occurred. The amount
of learning is proportional to the amount of change in ability."¹
If we expect teaching to result in some kind of desirable
outcomes, we must see the end from the beginning; that is, we
must know what outcomes we are striving for before we can
measure our success in reaching them. The first problem of
this paper is concerned with setting, defining, and justifying
objectives of Bible teaching.

1. Trabue, M.R. "Measuring the Results in Education", pp. 18, 19

The rest of the problems discussed in this paper
are concerned more immediately with measurement as it
applies to Bible teaching. Some of them are merely enumerated
and others are discussed more fully. There is no
attempt made to solve these problems. The procedure to be
followed in discussing any problem is determined by the
fact that this paper is being written from the viewpoint
of the Bible teacher himself, the man or woman who teaches
in the Church School or college, and not from the viewpoint
of the technical test-maker. For example, the problem of
convincing these teachers of the need of measurement is
treated more fully than the problems involved in constructing
standard tests. This same criterion will be applied to
the whole procedure of this paper.

Section I. The Problem of Determining and Defining the
Objectives of Bible Teaching

Objectives may be comprehensive or they may be specific.
A comprehensive objective would be stated in general terms of
the whole life, that is, it would look forward to the ultimate
result of one's whole education. Specific objectives would be
in terms of steps along the way; they are the desired outcomes
that may be expected to appear in the life and conduct of
persons as they grow toward the more comprehensive objective.
It is not expected that Bible teaching will be the whole of
one's education, or even of his religious education, so the
objectives of Bible teaching may have a specificness which
hopes for accomplishment in the growing life as well as
looking forward to an ultimate result. In other words, it
is assumed that there may be changes or outcomes, as the
result of teaching, in the experience of any group of normal
students. Some objectives may be more immediate than others
in that they may be sooner realized in experience. All specific
and immediate objectives must be determined in relation
to more comprehensive objectives, or to the great objectives
of a life. Christian philosophy, Christian ethics, the needs
of man, and the nature of society determine the nature of
objectives and what is deemed desirable in the way of experience.
As objectives of Bible teaching are presented here,
they will be determined by and related to what is considered
desirable Christian experience.

Almost every curriculum of religious education is preceded
by a statement of objectives. It is illuminating to
abstract and consider those objectives which are based on,
or concerned with, Bible teaching. The International System
of the Closely Graded Church School Courses is one of the
most widely used systems in the Protestant Church School
today. A survey sheet of the courses with objectives and
subject matter reveals the following facts: Stories from the
Bible are used throughout all the grades along with other
material, and part of every objective is based on Bible
teaching. Even the Primary Department objective is "To
provide such opportunities for Christian living that the
child's religious experiences will be enriched; his concept
of God as revealed by Jesus definitely expanded..." This
"concept of God as revealed by Jesus" is based on Bible
material. The Junior objective is "to help the child
to become a doer of the Word, and to lead him into conscious
loyalty to Jesus Christ". Part of the material or subject
matter suggested for reaching this objective is "Bible Stories
and incidents from the life of Jesus... a study of the example
of Jesus... and the teachings of Jesus." The Intermediate
objective assumes general acquaintance with the Bible, with
special emphasis in the third year on "The Life and Teachings
of Jesus". The objective is stated thus: "To lead the pupils
to know Jesus, and to desire to carry out his teachings in
their lives." Naturally much of the material is from "The
historical life of Jesus... (and)... narratives from the life of
Jesus with abundant scripture references." The Senior courses
also assume a Biblical background and use "Problems of everyday
life with both Old and New Testament as background." So
throughout this whole series Bible material is found to have
a prominent part in the curriculum. It does not seem necessary
in this paper to argue the reasons for including the teaching
of the Bible in a curriculum of religious education, since
it is already included in all widely used systems. Perhaps
as we progress and see what results may legitimately be expected
as a result of including the Bible in the curriculum
we shall have arguments enough in its favor.

One of the most cooperative efforts in religious education
is that undertaken by the International Council of
Religious Education. In 1929, they published a tentative
statement of the Objectives of Religious Education, which
is probably the best and most comprehensive [statement]
yet for religious education. Of course, they are not
limited to the objectives of Bible teaching, but here again,
one may abstract those which seem to be based on the Bible. An
abstraction and restatement of such objectives as were found
in the International Council bulletin¹ is given below. The
restatement is not very different, but is stated in the light
of measuring the outcomes of Bible teaching, since the measurement
of outcomes must be determined in part by the outcomes
desired. Throughout the rest of this paper, in speaking of
outcomes or objectives which are desired as a result of Bible
teaching, it is these objectives which follow that are in the
mind of the writer:

1. A knowledge, understanding and appreciation of the life,
personality, and teachings of Jesus, that will lead to a
discovery of Him as Savior and Lord, loyalty to Him and his
cause, and a control of daily life and conduct in terms of
Jesus' ideals for life and conduct.

This objective may be divided up into three phases:

a. Jesus as a personality

b. The examples and teachings of Jesus

c. The cause of Jesus - building the Kingdom of God.

2. A consciousness of God as a reality in human experience
and a satisfactory personal relationship to him

a. through the growing revelation of God to the
Hebrews

b. through Jesus' interpretation and revelation
of Him.

3. An understanding and appreciation of the Bible (a knowledge
and satisfying view of the Bible)

a. as a book of Religion and a record of religious
experience.

b. as literature - which interprets the personalities
and teachings of the Bible.

c. for personal counsel, inspiration and guidance.

1. The International Council of Religious Education, "A
Cooperative Curriculum Enterprise", pages 19-26

As these objectives stand they are not very usable to
a teacher for they are too general. Learning is specific.
A teacher may do his best to teach a child to follow the example
and teachings of Jesus, but that child would only learn
as he learned these examples and teachings separately, and even
then he would not learn them as a whole until he had learned
many of their specific applications. Reverence is one thing
which the child should learn, for reverence is one of the
teachings of Jesus. One situation in which reverence should
be an element of conduct is in the church worship service.
Many times boys and girls are told to be reverent in the
worship service, but just as often they are not reverent,
not because they do not desire to follow instruction but
because they do not know how to apply the trait to the specific
situation. They must learn that to be reverent in church
they must listen attentively to the service, think carefully
about the service, participate in it so far as they can, refrain
from movements that distract anyone's attention, and so
on. Many times even further analysis is necessary; for example,
the child may need to learn what movements are distracting,
and what ones are necessary. Reverence is necessary in other
situations and at other times, and the child has not learned to act
reverently until he can apply the trait to many situations, and
each situation must be learned separately until enough have
been learned to guarantee the learning of the common elements
which apply to all situations, so that he can meet new situations
and apply the trait to them.

Given the objectives stated above, one will readily
recognize that there are different kinds of learning involved
and that there will, therefore, be different kinds of
measurement needed. Each of these major objectives assumes
that there will be certain knowledge acquired. For example,
take one phase of the first objective and it will include
knowledge and understanding of the personality and life of
Jesus, knowledge and understanding of his life and teachings,
and a knowledge and understanding of his cause. Knowledge is
assumed as a desired outcome in each objective, and when we
get ready to measure these outcomes, we must be prepared to
measure the knowledge acquired. This means that there will
be certain facts of Biblical information which are desired;
it means that certain religious ideas will be gained and
certain skills will be acquired in making ethical judgements
and formulating courses of action. Perhaps this first phase
of the desired objectives can be called the knowledge-skill
outcomes.

But these objectives assume more than knowledge and skills.
They assume the acquirement of some such dynamic factors as
interests, ideals, attitudes, appreciations, and prejudices.
Notice such phrases used in the objectives as "appreciation of",
"loyalty to", "satisfying view of", etc., which assume more
than mere knowledge and skills. A teacher of the Bible is not
satisfied to have his students able to repeat the Bible, or
handle skillfully its ideas, but he desires his teaching to
result in dynamic, motivating ideals, attitudes, appreciations
and interests which will tend to influence his life and conduct
because it has a personal meaning for him. We want
pupils to acquire attitudes toward God which will involve reverence,
cooperation, etc. We want them to have right attitudes toward
others - attitudes of neighborliness, love and kindness. We
want them to take proper attitudes to themselves as children
of God. We must be prepared then, to measure more than
knowledge, for after all the measurement of an amount of
knowledge is comparatively easy; we must be prepared to
measure attitudes acquired. In the description of Thurstone
and Chave's experiment in "The Measurement of Attitude",
Dr. Chave has made these statements: "The more important concern
of Religious Educators today is to measure ... how far
attitudes and values that express the religious tendencies
considered to be directed toward the realization of the highest
good for the individuals themselves and for the society of
which they are members have been developed in individuals and
in groups of persons. These attitudes involve tendencies
toward the institutions of religion - its symbols, its literature,
its expressed doctrines, its concepts, ideals, programs,
and other phases of religious living. The attitudes
taken by persons indicate the values discovered in their
personal and social religious experience."¹ The problem of
how to measure these complex factors will arise later in this
paper.

1. Thurstone and Chave, "The Measurement of Attitude", p. IX

But our objectives assume even more than knowledge-skills,
and the dynamic factors such as interests, attitudes, ideals
and appreciations. They assume changes in behavior, or conduct.
Recall this statement from the first objective, "A control
of daily life and conduct in terms of Jesus' ideals for
life and conduct." Each objective has its conduct phase.
Knowledge and attitudes act as motives in conduct but conduct
goes beyond either. If we follow the theory of Percival M.
Symonds, we may think of "Human conduct as a product of
natural forces in much the same way as is the rest of the
physical world, which is being so effectively understood
and controlled. . . . By studying all possible combinations
of stimulus and response and the relations of these combinations
to conduct, one is led inevitably to the conclusion
that, after all, this illusive ideal, character, is really the
organization of large numbers of habits. Such a conclusion removes
the suspicion of sentimentality from character education
and makes it instead a problem for scientific educational
engineering."¹ For purposes of measurement, Hartshorne and
May² think of conduct as specific acts of behavior, but they
recognize the close and subtle relations between behavior
and knowledge and attitudes. Whether we conceive of conduct
as do these men, or whether we hold that it is largely
controlled and determined by the ideals and attitudes of the
subject, for our purposes it is the specific acts organized
into a controlled conduct in harmony with Jesus' ideal
of life, which we hope for and posit as a desired outcome of
Bible study.

1. Symonds, P.M. "The Nature of Conduct", pages VII, VIII

2. Hartshorne and May, "Studies in Deceit", page 11

In summary of this first problem recall that the teacher
must know why he is teaching and what he hopes to accomplish.
We have discovered that learning is in terms of changes -
that we should expect certain useful changes as a result of
teaching and that these useful changes may be expressed in
terms of desired outcomes or objectives. Then we have seen
that learning is specific and that a general objective is
composed of many subdivisions. A teacher cannot reach a
general objective when neglecting its component elements, so
neither can she measure general, complex qualities, but must
teach and then define or measure specific changes desired.
Attempts to make the objectives specific showed that there
were three kinds of changes desired, and therefore, three
kinds of measurement needed. These three classifications were
concerned with (1) the acquisition of knowledge and skills,
(2) the acquirement of dynamic factors, such as interests,
ideals, attitudes, appreciations, and prejudices, and (3) the
ability and practise of self-control in conduct according to
the Jesus ideal of life.


Section II. The Problem of Convincing Educators and Administrators of
the Value of and Need for Measurements in Religious Education

Not only does this paper deal with the problems in defining
and justifying objectives or desired outcomes in
Bible Study, but with problems involved in measuring these
desired outcomes. Many people do not see the need for
measurements in education, and especially in religious
education, so the second problem is to convince our religious
educators of the need of measurement by giving the
reasons for the use of measurement in religious education.
It will be readily seen that many of the same reasons apply
equally in the fields of religious education and public school
education. Chester A. Gregory in "Fundamentals of Educational
Measurement" has given the first two chapters to a discussion
of the reasons which I am quoting here:
(1) Progress is conditioned by the ability to measure. Professor
Gregory has said that "Progress in our civilization
has depended very largely on our ability to measure. James
Watt, for instance, could not make a steam engine until men
were able to make measurements so exact that a cylinder and
a piston could be built that were steam tight and yet allowed
free play... It is no longer a matter of opinion as to the
strength of a steel girder, for instance, or the resistive
power of a steel rail. The scientist now speaks with authority
along these lines. Natural science has made its gains by
substituting facts for opinions, and units of measure for
mere guess work. ... When the teaching profession enters the
stage where its data and conclusions can be presented in
quantitative as well as qualitative terms, it has entered
upon a most important stage of development. Men in all business
and professions are becoming quantitative thinkers.
Education is not an exception. Educators are seeking to
verify and, in some cases, to refute the established beliefs
concerning the effects of educational forces upon
human nature. Dogmatic and authoritative control of education
is going the way of all mere authority and dogma in
human affairs. The popular guessing contests that have
been going on in education as to which processes are the
best, and what products are obtained from them, are giving
way to experimentally determined facts. It means that
education is emerging from among the vocations and taking its
place among the professions."¹

What was true in the educational field when this was
written is beginning to be true in religious education. Religious
educators have guessed long enough. It has not been long since
(indeed it is still too often true) religious teachers
guessed, or supposed, or took for granted, that memorizing Bible
passages was the process which would produce a Christian character;
but too many people who can recite scripture are pagan
in conduct and attitude. Progress in religious education will
come about when we are able to measure results and find out what
the different processes do produce and how much of it they produce.
It is as we learn which processes produce the desired results
that we can apply these processes to the achievement of our goals.
First of all then, we must produce instruments which will measure
the results of our teaching, and these instruments must give
scientific, accurate data which guide and direct teaching processes.

1. Gregory, C.A. "Fundamentals of Educational Measurement",
pages 5, 6

(2) Definite aims are set and adhered to through the requirements
and practise of measurement. General aims have been
the rule in education. If a teacher is asked the value
of education, or her purpose in teaching, she may reply that
she hopes to make a Christian citizen out of Johnnie. Now,
given this general aim, how will she proceed to do this? She
may teach him to repeat the Ten Commandments, to sing the
religious songs, or to pray in public, and Johnnie may be
able to do all these things and yet be a scoundrel, a menace,
and of no use to other people or his country. How then, is she
to know how to make a Christian citizen out of him, that is,
what and how shall she teach him so that she gets the desired
results? Her aim is so big, so indefinite, so illusive that
her task seems a hopeless one. Professor Bobbitt has well
stated the present situation in regard to educational aims.
He says, "We have aimed at a vague culture, an ill-defined
discipline, a nebulous, harmonious development of the individual,
an indefinite moral character building, an unparticularized
social efficiency, or often enough, nothing more
than an escape from a life of work."¹ Professor Bobbitt's
choice of adjectives is significant - vague, ill-defined,
indefinite, unparticularized - for this is just the situation
in religious education. This is the situation in regard to
Bible teaching. The aims are not stated at all, or if so,
are not particularized. Recall the discussion in the first
part of this paper. Even when enough progress has been made
to state the inclusive aims of Bible study, they are yet so
general that teachers are not able to profit by them. They
must be made specific.

1. Bobbitt, J.F. "The Curriculum", page 41

The situation is almost like that of
the little boy who was told by his mother to keep his
clothes-closet neat and in order. He did not know how to
make it neat until she analyzed neatness by telling and
showing him that his clothes should go on hangers, his shoes
should be paired and placed in rows, and that his hats and caps
belonged on the shelves. The general aim of neatness could not
be attained until it was broken up into its specific subdivisions.
Christian Citizenship is a general objective and must
be divided up into its component elements and specific objectives
before it can be taught or measured. "To the extent
that any goal of education is intangible it is worthless. We
want to be able to answer at least three things about any
goal: (1) What is the worth of the goal? (2) What is the
location of the goal? (3) Is the pupil moving toward or from
the goal? Measurement is necessary to answer each of these
absolutely vital questions."¹ Now teachers may go along
indefinitely, as teachers have, hoping by some good fortune
to be accomplishing these general aims, unless there is some
way of measuring their results and knowing what is happening.
But let a teacher attempt to measure her results, and immediately
she becomes aware, not only of her successes and failures,
but that she must have something definite to measure -
that she will have to measure the various elements entering
into the result. This will force her to analyze her objectives
into sub-objectives and specific aims and adhere to them
in her teaching in order to measure progress.

1. McCall, W.A. "How to Measure in Education", page 11.

For illustration of this argument in the field of
Religious education and Bible teaching, one of the desired
outcomes is a satisfactory personal relationship to God. This
is very general and inclusive, for a satisfactory and personal
relationship to God involves one's relationships to
others, to the world about him, involves all his activity,
habits and thoughts, and in fact, all his life. Even to explain
how very general it is, is to subdivide it somewhat.
Everyone, however, will admit that it is a desirable outcome
of Bible study. How then, is it to be achieved, and shall I
add, measured? Only by analyzing it, and discovering of what
it is composed. This means several sub-divisional objectives
such as these:

a. Satisfactory communion with God.

b. Knowledge and understanding of, and right attitude toward,
God's ways of working, his purposes, and his
relationships to men.

c. Cooperation with God.

Then if one attempts to measure these he must again
analyze and subdivide. Take the first named, satisfactory
communion with God, and it will involve knowledge, attitudes,
and conduct. One of its subheads will be a right attitude to
prayer as communion with God. Then this attitude may be
analyzed still further, and so on indefinitely, until the
teacher knows very definitely what is included in a right
attitude, and will attempt to measure these definite qualities
and to determine their amount. Suppose she does, what is
the gain? First of all, she will know definitely what to aim
for. This analysis will set definite aims, and make for definite
teaching, and if measurement does no more, it will
have proved its value.

(3) Wastes are eliminated through the requirements and
practise of measurement. Economy in time is ever a necessity
in this scientific age which makes so many demands that
efficiency not only becomes the watchword, but becomes the
test of continued existence. Progress eliminates a waste
of time, and since ability to measure makes progress possible,
measurement saves time. For lack of measurement
which would yield better knowledge of pupils' ability to
progress, many pupils in the public schools have been forced
to mark time, that is, waste time, when they were capable of
making faster progress. Measurement would help to classify
pupils with others of the same or like abilities and a
teacher would not be needlessly wasting the time of one
group while ministering to the needs of another whose range
of ability was so different that the two groups require different
treatment. Measurement, then, would help in classification
so that pupils' time would be saved by proper
grading. The Church School with its unsatisfactory grading
needs this same safeguard as does the public school. Lentz¹
argues that classification on the basis of morality is
essential and that tests of character will make such classification
possible. He writes, "Today we have in the same
classes children of various degrees and kinds of moral rectitude.
But our large classes make necessary much the same
treatment for all children within a class. Thus, with both
honest and dishonest children within a class, the group
will be treated as either honest or dishonest, trustworthy
or untrustworthy as a whole. To the child who cheats, continued
opportunity and continued temptation to cheat mean
strengthening the habit of cheating. In this way our schools
may and often do, become schools of crime. On the other hand,
for the honest child to be watched and distrusted relieves
him of individual responsibility and decreases his moral
self-confidence."¹

1. Lentz, T.F. "An Experimental Method for the Discovery
and Development of Tests of Character", page 4.

Measurement eliminates wastes by revealing those methods
which get results and eliminating those that mean failure;
therefore, it prevents waste of teaching effort. Gregory 2/
states, "When fully considered, much of the great waste in
education is due to our lack of adequate means for placing reliable
estimates on our results and processes. We lack in the
matter of definite, desirable, attainable goals to be sought
through a given topic, or process, or stage of work in a given
subject. We have been forced to work in a more or less blind,
do-and-trust-to-luck way. Whenever the application of scientific
measurements to the achievements of school children has
been made, it has shown that great waste and unbusinesslike
methods are being practised. A school system must meet the
same requirements that a business corporation must meet. The
output must be commensurate with the expenditure."

1. Ibid., page 4.

2. Gregory, C.A. "Fundamentals of Educational Measurement",
pages 15, 16.

Many are pointing accusing fingers at the church and
church school because they have sent their boys and girls
there expecting certain results in their lives. When these
boys and girls grow up to be irreligious, lax in morals,
ignorant of the virtues which the church is supposed to teach,
and prejudiced against the church and its purposes, then the
church school can well be accused of having wasted money, the
time and effort of teachers and pupils, and the character
products which should issue in the lives of the boys and girls.
It may be that home and other influences have counteracted the
church's influence, but who can say until the church can
measure the results of its teaching? According to Gregory,
"One of the fundamental principles to be kept in mind in
the solution of educational problems is that everything with
which the educator works, time, energy, money, apparatus,
resources of all kinds, are so limited that trusting them in
the hands of the ignorant cannot be endorsed, and prodigality
with them is crime." 1/ It is to prevent this costly waste
and to know wherein we succeed or fail to produce Christian
character that we must be able to measure the results of
our teaching.

1. Gregory, C.A. "Fundamentals of Educational Measurement"
(4) Education is placed on a factual basis through educational
measurement. It has already been stated that many schools
and teachers are inefficient because it is not known which processes
will produce what results or products. In public
education as well as in religious education, even among the
leaders, there are conflicting opinions and much disagreement.
There are some phases of education which should be a matter
of scientific fact and there are some phases which are a
matter of creed. What constitutes a Christian has been so
wholly a matter of creed, and there are such varying and contradictory
creeds and philosophies, that it is not to be
wondered at that there are conflicts in the field of religious
education. These conflicts are not limited to the
field of religious education, for they rage in the field of
public education. Professor Gregory says that, "'I think', 'I
guess', 'it is my opinion' are the common and characteristic
phrases in education. 'I know' is a phrase that has scarcely
been admitted." 1/ How much more true this is in the religious
field! It seems necessary to leave some things as a
matter of creed, but some things are scientific educational
facts and should be given credence. Where this is true mere
opinions should not be allowed to rule. Is it not foolish
that whenever teachers get together no one says with authority
that this method or procedure will accomplish certain
things under such and such conditions because some related
things are scientifically true? They are always saying, "What
are some of the things you have tried and how did they work?"
So one will say, "I always put an athletic man in as teacher
of my Junior boys, because the boys are interested in athletics
and he can show them that all the people in the church are not
sissies." Another will say, "But I have found that such a
person never teaches them any Bible so I have chosen a woman
who is very good on Bible drill because she is able to work
up enough competition to make memory work a pleasure." And
so it goes. It is well that our scientific spirit is beginning
to ask that our assertions be backed up by facts. We
must have some idea of the nature of conduct, of motivation,
of the value of information and ideas, of the pulling power
of emotional factors, of the formation of habits and their
control in behavior and conduct. Not until then will education
become an art and a science. It is those working to
develop measurements who recognize the need for a factual
basis in education rather than an opinion basis. If we guess
we are making progress we can run along on an opinion basis,
but when we want to know and apply some accepted standards
of measurement, opinions must fall by the way and make room
for facts.

1. Ibid.

(5) Measurement tends to set up standards in education.
This problem has probably been well enough argued in the preceding
paragraph, for a factual basis will act as a standardizing
factor. This may be illustrated: a teacher with little
experience and too little grounding in psychology and education
to set up her own standards will nevertheless be guided
in her procedures and processes through the supervision
which adequate tests and measurement impose. If she knows what the
tests demand, she already has a guiding basis for her teaching.
Furthermore, tests are diagnostic - of the teacher's
work as well as of the pupils' work - and will aid the teacher
to analyze her failings and successes, which means that she
has already gone a long way toward correcting her own faults.
Without tests or measurements many teachers are unable to
diagnose or analyze their own teaching, but with them they are
helped in such analysis and a standard for their teaching is
set up. If such testing is used throughout a whole school
or school system, it tends to bring the whole school up
to the same standard, and to set a higher standard (perhaps
unconsciously) than the school or system would have otherwise.

This second problem of convincing educators and administrators
of the need for and value of tests and measurement
is an important one. In summary of this discussion of it,
recall that measurement is just as important in the field
of religious education as in public education, and so
Professor Gregory's reasons for the use of tests and measurements
in education are found applicable to the field of religious
education. In brief review, when adequate measurement
is possible - and we are working toward that ideal - it is
of value because it insures progress, it makes for definiteness
in aims and teaching, it eliminates waste, it places education on a
more factual basis, and sets up standards of education. These
have been found necessary to efficiency in public education,
and the present status of religious education makes their
use imperative in that field.

Section  III.  The  Nature  of  Educational  Measurements,  and  their 
Application  to  the  Measurement  of  Bible  Teaching. 

Having established reasons for the use of measurements
in Religious Education, any educator who wishes to use them
is faced with the necessity to discover the nature of these
measurements. There are two general kinds of measurements
which have been used in the public schools, that is, subjective
and objective tests. Until the scientific method
was applied to education, the subjective method was dominant.
Teachers measured their pupils' achievements, and
marked and promoted pupils, on the basis of their own opinions,
on their own evaluation of a pupil's performance.
Subjective judgements have many dangers and disadvantages.

First of all, a teacher usually makes out her own tests, and
the important abilities are not always chosen for measurement.
Professor Monroe, in discussing this subject says, "The
field of a school subject includes a large number of abilities.
Careful analyses are necessary in order to determine what
these abilities are, and which ones are most significant.
In formulating the questions of ordinary examinations the
teacher does not usually have at hand a statement of the important
abilities within the field of a school subject, and,
as a consequence, some of the important abilities are frequently
interpreted as being measures of the entire field of
abilities.... The making of satisfactory measurements is dependent
upon a careful analysis of the subject matter field
in which they are being made. The ordinary classroom teacher
does not have at hand these analyses in preparing examinations
and in assigning school marks." 1/

The Church School or Bible teacher is less apt to have
at hand a careful analysis of the abilities within the field
of his subject, for the application of scientific method to
this field is more recent, and as discovered in the section
of this paper given to objectives, the field has not yet
been specifically and carefully analyzed.

In the second place, the teacher's individual judgement
is too subjective because she has nothing but a fleeting and
variable memory with which to compare either her tests or
their results. When a teacher says
of a particular examination paper that it is good, what is
her meaning of good? There should be a standard or norm
which may be used as a basis of comparison. When a teacher
judges a paper to be good, it may be because she is partial
to the special emphasis given in that paper, although the
consensus of opinions from many experts would be to the
contrary. Furthermore this norm represents the average ability
of many pupils so that any one pupil may be measured in
terms of the average. Tests made and graded on the basis of
teachers' individual judgements - or even by local supervisors -
will lack the objectiveness which standard tests, or
more objective tests, will give. It is probable that there will
always be a place for individual teachers to give and construct
tests, but the more objective she can make these tests, so
that all pupils can be scored on the same basis regardless of
who grades the test, the better she can compare the members of
the group with each other 2/ and establish norms for comparison
with other groups. One may rightly ask how this applies to
the field of religious education and Bible study. Although it
is not most important, yet a test of the knowledge gained in
a Bible study class may be, and often is, given, and objectivity
is just as necessary as in any other field of knowledge.
Suppose it is not an examination paper which the teacher is
evaluating subjectively, but the results of her teaching as
found in the attitudes or conduct of the pupil. What would
she mean if she classified a pupil as "good" in his attitude
toward prayer, or toward wealth and possessions? Without
objective testing and the ability to compare this pupil
with the average, teachers will vary in their rating of him,
and the estimate of the usual teacher is highly unreliable.
Furthermore, few teachers claim to have any knowledge of how
much their Bible teaching has done for individual pupils.
Too often they take refuge in the thought, "I have planted
the seed and must leave the harvest to the Lord." Sometimes
it is necessary to scan the harvest in order to test the
kind of seed sown.

1. Monroe, W.S. "An Introduction to the Theory of Educational
Measurements", page 35.

2. An even more important function of tests is the comparison
of the present status of a pupil with his former status.

A third disadvantage of the ordinary test is its inability
to completely describe the pupil's abilities. A teacher
seldom computes or knows the difficulty value of a
question in her test. The questions will seldom be of equal
difficulty, although they may be graded on the same basis
and used just alike in determining the pupil's achievement.
If a teacher's estimate of the difficulty of questions is
unreliable, she is not able to describe a pupil's abilities
by that test. If a pupil answers four of the easier questions
of a test, his ability may still be less than that of the
pupil answering only three questions, if these three should
much exceed the others in difficulty. It would seem then that the
teacher ought to evaluate the pupil's ability on the basis of
difficulty. The results of tests given to teachers for judging
difficulty have proven that even the average judgement of twenty
teachers is unreliable. How much less reliable is the
judgement of one teacher! 1/ A complete description of
abilities would include not only difficulty but also
speed or rate. The amount of work a pupil can do in a
given time is some indication of his ability. Often, only
the quality of the answers is considered when speed or rate
may indicate a truer measure of his ability.

1. See Monroe, "Measuring the Results of Teaching", pp. 10-15.
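
The case of four easy answers against three hard ones may be sketched in a few lines of Python. The difficulty weights and the pupils' records below are hypothetical, chosen only to mirror that case; the studies cited give no such figures, and a real test would need difficulty values found by trial.

```python
# Hypothetical difficulty weights for a seven-question test
# (1 = easy, 3 = hard). These values are assumptions for illustration.
DIFFICULTY = {"q1": 1, "q2": 1, "q3": 1, "q4": 1, "q5": 3, "q6": 3, "q7": 3}


def raw_score(correct):
    """Count of questions answered correctly, every question counted alike."""
    return len(correct)


def weighted_score(correct, difficulty=DIFFICULTY):
    """Sum of the difficulty weights of the questions answered correctly."""
    return sum(difficulty[q] for q in correct)


pupil_a = {"q1", "q2", "q3", "q4"}  # four of the easier questions
pupil_b = {"q5", "q6", "q7"}        # three of the harder questions

print(raw_score(pupil_a), raw_score(pupil_b))
print(weighted_score(pupil_a), weighted_score(pupil_b))
```

On the raw count the first pupil stands higher; once difficulty is taken into account the order is reversed, which is the point of the paragraph above.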

Quality and accuracy seem to be the abilities most
often measured, but the variation with which they are evaluated
indicates the fourth danger in the subjective measurement.
Investigations have been made of the ways different
teachers mark the same papers, and it has been found that
the marks a pupil receives on a paper depend upon the
teacher who grades the paper, as well as what the pupil has
written in the paper. The investigation of Starch and
Elliot 1/ of the University of Wisconsin disclosed that when
116 teachers graded the same geometry paper, two teachers
gave a grade above 90, twenty gave a grade above 80,
twenty were below 60, and one below 30. Forty-seven
teachers gave a passing mark to the paper, and 69 gave it
a failing mark. Similar results have been obtained in
other studies, and the only possible conclusion is that
marks under the usual conditions are highly subjective and
therefore do not give accurate measures of pupils' abilities.
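
The division of the graders may be put in percentages with a short Python sketch. Only the three counts quoted above are used; the helper name `share` is hypothetical and introduced here for illustration.

```python
# Counts reported in the text for the one geometry paper.
TOTAL_TEACHERS = 116
PASSING = 47
FAILING = 69


def share(count, total=TOTAL_TEACHERS):
    """Percentage of the graders, rounded to one decimal place."""
    return round(100 * count / total, 1)


# The two groups account for every teacher in the study.
assert PASSING + FAILING == TOTAL_TEACHERS

print(f"passing: {share(PASSING)}%   failing: {share(FAILING)}%")
```

The same paper, in other words, passed with roughly two teachers in five and failed with the remainder.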

Still another disadvantage of the usual test or examination
is that it is not diagnostic - much less prognostic.
The teacher's idea is to cover as much of the field as
possible. This means a wide range of topics, but these
topics have no definite meaning insofar as they diagnose
the pupils' particular weaknesses or reveal particular
abilities. If a teacher is to use the results of measurement
to guide her in guiding and helping the child, it must definitely
reveal something concerning the status and needs of that
child. How many tests are so constructed?

1. Starch and Elliot, "Reliability of Grading Work in
Mathematics", School Review, Vol. XXI, pages 254-259.

The usual essay type of examination has other disadvantages and
limitations; there is the time required to write lengthy
answers, which means that the sampling must be limited to a
small number of broad questions. The time required to grade
such an examination is another handicap. These criticisms and
most of those above apply in measuring the knowledge acquired.
It is well to remember, however, that knowledge is just one
phase of our desired outcomes, and that it is the most easily
measured. If this is true and subjective measurements are unreliable,
we would do well to be on guard against accepting
subjective judgements concerning the amount of change in the
more complex results, that is, in the attitudes and conduct
which we desire our teaching to accomplish. Teachers are
more apt to ride hobbies in these fields than in the field
of knowledge - to be partial to particular traits and judge
all of a pupil's work on the basis of that particular trait
development - and they are more likely to be affected in their
judgement of the amount of that trait by their liking for the
pupil.

If the usual methods of measurement are highly subjective
and, therefore, are not able to secure the results
that measurements should secure, it is time that educators
seriously consider the more objective tests and some of the
standard tests. To construct such objective and universal
scales is the great problem in measurement - and in education -
today. Before we proceed farther it is well to define what
is meant by objective measurements. Objective instruments are of
two kinds, standardized tests and unstandardized objective
tests. A standardized test, as the name implies, is provided
with norms or standards of achievement. It should, however,
to meet accepted requirements, have certain other qualities.
First of all, it is objective, that is, in scoring, personal
opinion is eliminated, and with a scoring key accompanying the
test, anyone, anywhere, scoring the test should get the same
results. Moreover, directions should accompany the test so
that it will always be given under the same or like conditions
in order to insure the same results regardless of who is giving
the test, or where. The test should be simple enough to be
easily understood. It should measure well some one thing,
and measure accurately whatever it does measure. This is
known as reliability. To some extent the test should measure
what it purports to measure, and this should be something of
worth and having value to the child or to the teacher in his
teaching. In so far as a test has these qualities and measures
what it purports to measure it has validity. The norms
which accompany a standardized test should reveal the average
achievement of some group, so that it may be used by any
similar group for comparison. Any test which is not standardized
should have a degree of these same qualities although
it need not have the norms or standards which characterize
the standardized test.
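
What is meant by a scoring key and a norm may be sketched in Python. The key, the pupil's answers, and the norm of 2.0 are all hypothetical and are not taken from any published test; the sketch shows only that scoring by key leaves nothing to the opinion of the scorer.

```python
# Hypothetical scoring key for three true-false items, and an assumed
# group norm (average score). Neither comes from a published test.
SCORING_KEY = {1: "F", 2: "F", 3: "T"}
GROUP_NORM = 2.0


def score(responses, key=SCORING_KEY):
    """Number of items on which the pupil's response matches the key."""
    return sum(1 for item, answer in key.items() if responses.get(item) == answer)


def deviation_from_norm(responses, norm=GROUP_NORM):
    """Pupil's score minus the average score of the comparison group."""
    return score(responses) - norm


pupil = {1: "F", 2: "T", 3: "T"}  # hypothetical answers
print(score(pupil), deviation_from_norm(pupil))
```

Any two scorers applying the same key to the same paper must reach the same score, and the norm gives that score a meaning beyond the single class.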

The advantages of the objective tests over the
more subjective or old-type examination have been suggested
in the criticisms of the latter. A brief summary of the advantages
follows: they (objective tests) are more objectively scored and,
therefore, avoid the dangers that we learned existed in the
subjective scoring. Objective estimates of attitudes and
conduct are more reliable than subjective opinion. In written
tests there is a great saving of time which provides for a
more extensive sampling of a pupil's knowledge or ideas. This
means a higher reliability of estimate of his abilities for
the time in which he works. A pupil's ability as described
in 100 test items is a more reliable index than his ability
in ten items. The objective test makes possible a more reliable
diagnosis of pupils' weaknesses and successes, a
more probable basis of prediction, and a greater control
of the examination by the teacher, for it avoids ambiguity
in the question and in the pupil's interpretation of the
question. Since it controls the pupils' avoidance of the
real question, there is greater freedom from bluffing.
The standardized test has also this other advantage, that it
is more apt to be constructed by experts and consequently
measures the most important abilities more reliably. The
consensus of opinion which often enters into the construction
of such a test makes it of greater value. With many standardized
tests there are duplicate forms which are a great help
to the teacher in emergencies, or for discovering the amount
of increase in the thing being measured.
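
The statement that 100 items give a more reliable index than ten may be illustrated with the Spearman-Brown formula, r_k = k r / (1 + (k - 1) r). This formula is not cited in the sources used here; it is brought in only as one standard way of expressing the point. It assumes that the added items are of the same kind and quality as the original ones. The reliability of 0.50 for the ten-item test is a hypothetical figure.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is lengthened by length_factor,
    assuming the added items are comparable to the original ones."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


ten_item_reliability = 0.50  # hypothetical value for a ten-item test
hundred_item_reliability = spearman_brown(ten_item_reliability, 10)

print(round(hundred_item_reliability, 3))
```

Under these assumptions the longer test is predicted to be considerably more reliable, though the gain from each added item grows smaller as the test lengthens.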

After one has discovered the advantages of objective
testing, he is faced with the problem of getting acquainted
with the types and nature of objective tests already in the
field. We may classify these tests according to what they
measure, or we may classify them according to technique. We
have already suggested that measurement may be classified
on the basis of the type of learning involved, for this determines
not only what is to be measured, but it determines
and limits the kinds of measurement that may be used. Here
we will let what we want to measure determine our procedure.

One of the most common results expected from learning
is known as knowledge. Knowledge involves the acquisition
of information or facts, of ideas, of vocabulary, and of
skill in handling these. There are several paper and pencil
techniques developed which are now in quite common use.
These may be classified as follows: 1/

1. Recall Types. There is the simple recall type of question,
such as,

Jesus was born in the town of ________.

The answer is not suggested and cannot be recognized, but must
be so thoroughly known that the pupil can recall from his
memory the required information. The statements must be clear.
To illustrate, if the statement read, "Jesus was born in ________,"
the answer might be Palestine, Judea, or stable, so the
pupil might give a correct answer, and still we would not
know whether he knew the town of Jesus' birth.

1. See Ruch, "The Objective or New Type Examination", Ch. 8,
or Watson, "Experimentation and Measurement", Ch. 5.

Somewhat like the Simple Recall is the Completion
type of test, for it, too, requires recall without suggesting
possible answers. The content of this test may be written
in sentence or paragraph form and key facts left blank to
fill in. A sample of this technique follows: 1/

Fill in the blanks with the missing words. Write the
words which will make the truest statement.

"The Jews believed that their race was founded by the
great patriarch named (1) ________. His son was called
(2) ________, and his grandson, after whom the children
of Israel were named, was called (3) ________."

This form of the Completion test is called by Dr.
Watson 2/ the Word-phrase-Answer Question because there is
only one word to be filled in. In some Completion tests
there is more than one word omitted. The Completion tests,
like the Simple Recall, must make certain that only one correct
answer may be inserted in the blank, and this makes them
difficult to construct. Ambiguous questions are apt to escape
detection even when examined by several people. One
must be careful, too, to leave out words which are really
significant and which really test the point at issue. When
Recall tests are well made, they appear to be the most reliable
tests of information.

1. Union Test of Religious Ideas, Form II, High School Age

2. Watson, G.B. "Experimentation and Measurement", page 134
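
The requirement that only one correct answer fit each blank may be checked in a simple way once the test maker has listed, for every statement, the answers he would be obliged to accept. The Python sketch below uses the two wordings of the birthplace item discussed earlier; the lists of acceptable answers and the function name are hypothetical, set down for illustration only.

```python
# Hypothetical record of the answers a fair scorer would have to accept
# for each wording of the item.
ITEMS = {
    "Jesus was born in ________.": {"Bethlehem", "Judea", "Palestine", "a stable"},
    "Jesus was born in the town of ________.": {"Bethlehem"},
}


def ambiguous_items(items):
    """Return the statements for which more than one answer is acceptable."""
    return [stem for stem, answers in items.items() if len(answers) > 1]


for stem in ambiguous_items(ITEMS):
    print("Needs rewording:", stem)
```

The check is only as good as the list of acceptable answers, which is why the paragraph above recommends that several people examine each item.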

The True-False technique gives a statement that may
be either true or false, and the subject is asked to indicate
which it is by encircling the word true or the word false,
or the letters T or F, or by marking + for true and - for
false. Sample statements illustrate the true-false technique: 1/

1. Peter was known as the beloved disciple - - - True False

2. The Pharisees were followers of Jesus - - - True False

3. Paul had the privileges of Roman citizenship - - - True False

In constructing such a test, caution is needed to avoid
extreme statements, to avoid confusing double negatives, and
to avoid using double-barreled sentences which combine clauses
with "since", "therefore", or "inasmuch", for if two elements
are being tested, one may be true and the other false. Each
statement should be direct and simple. True-False questions are
used primarily to test information, but may also be used to test
a high degree of insight into relationships and causes, or
ability to make discriminations. Adelaide Case's "Test of
Liberal Thought" illustrates this use of the True-False
technique. Sample statements from it follow:

1. A Christian is well defined as a person who
believes in Jesus Christ as God and Savior - - - True False

2. All Jews will try to get the best of a bargain
even if they have to cheat to do it - - - True False

3. The church is Christian only in so far as it
seeks actually to live out the belief
that God is our Father and all men are
brothers - - - True False

1. Laycock, S.R. "The Laycock Test of Biblical Information"
The Yes-No technique is almost equivalent to answering
true or false. A sample will illustrate: 1/

"Below are some questions asking what you think about
God. If you think the answer is 'yes', draw a circle around
the word 'Yes'. If you think the answer is 'no' then draw
a circle around the word 'No'."

1. Do you think that God is very strong, so that he
could pick up the whole world? - - - Yes No

2. Do you think God made the whole world, all the
planets, animals and people? - - - Yes No

3. Do you think God is like a very old man? - - - Yes No

Also associated with the true-false technique is the
statement to be marked right or wrong. It is so much like
the Yes-No technique that an illustration here is unnecessary.
The same cautions must be observed in constructing
the Yes-No and the Right-Wrong tests as suggested for the
True-False tests.

1. The Union Test of Religious Ideas, Form I, for Children

The third type of technique and probably the most
common is that of the Multiple Response, or Multiple Choice
tests. These may range all the way from two to six or more
responses. They are sometimes called the "Best Answer" tests
for the subject selects from the responses or possible answers
given, the one or ones which he thinks best. Examples of
these various forms follow:

Two Response: 1/

"Below are listed several reasons why some people feel
that Jesus should be given our allegiance, why we should
claim him as Lord, as Master, as Son of God, and as our
Savior. Read each and if you think it is a strong, important
reason for giving Him this loyal allegiance, draw a line
around the word STRONG. If you think it is a weak or less
important reason, draw a line around the word WEAK."

1. Because he is reported to have been born of
a virgin - - - Strong Weak

2. Because he always lived unselfishly - - - Strong Weak

3. Because he had power to heal many forms of
sickness - - - Strong Weak

4. Because he claimed to be the Messiah - - - Strong Weak

5. Because all he did and said was moved by
a spirit of real love - - - Strong Weak

etc.

Three Response: 2/

"Make a cross before the best answer to each question:

1. If a playmate hits you without meaning to do it, you
should: ( ) hit him back
        ( ) make him say he's sorry
        ( ) excuse him."

1. Union Test of Religious Ideas, Form II, page 5
2. S.C. Kohs, Ethical Discrimination Test.
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Another  sample  of  the  three  response  test  may  be  taken 

1. 

from  the  Union  Test  of  Religious  Ideas  . It  is  often  called 
the  Degrees  Type  of  test. 

"Below  are  listed  some  brief  prayers,  Read  each  carefully 
and  then  if  you  think  it  is  an  excellent  type  of  prayer,  ty- 
pifying the  genuine  Christian  ideal,  draw  a line  around  the 
word  excellent.  If  you  think  it  is  allright,  but  is  not  the 
best  type  of  prayer,  then  draw  a line  around  the  word  Fair. 

If  you  think  it  is  a poor  type  of  prayer,  unworthy  of  a 
person  trying  to  pray  in  the  spirit  of  Jesus,  then  draw  a 
line  around  the  word  poor. 

1. Father forgive them for they know not what they do
                                      Excellent  Fair  Poor
2. Lord bless me and my brother,
   And my father and my mother.
   Amen - - - - - - - - - - - - - - - Excellent  Fair  Poor
3. Help us, O Father, to play this night the best
   basketball we know how to play - - Excellent  Fair  Poor
etc.

Four Response: (2)

"Choose the one answer which makes the truest or best
statement, and place an X in front of this answer.

1. Christ was born in
     Jerusalem.  Bethlehem.  Nazareth.  Capernaum.

2. Herod asked the Wise Men to tell him where they found
   Jesus, so that he might worship him, but
     they did not find Jesus
     they did not obey Herod
     they did not know where to look for Jesus
     they thought Herod was crazy"

1. Union Test of Religious Ideas, Form II, High School.
2. Hanson, W. L., Church School Examination Alpha, Form I.

Five Response: (Moral Judgement Test - Pressey)

"Which is the worst? . . . Draw a line under the worst.

1. fighting, killing, hating, quarreling, hurting.
2. borrowing, gambling, overcharging, stealing, begging.
3. love, hate, fondness, dislike, liking."

This type of test is often called the Cross-out test.

Six Response: (1)

"Most  people  agree  that  swearing  is  an  undesirable  habit. 
Below  are  reasons  sometimes  given  in  support  of  this  view. 

Read  them  all  carefully.  Then  make  a check  mark  in  front  of 
the  one  which  seems  to  you  the  best  reason  of  all.  Check 
only  one. 

     the Bible says it is wrong.
     it is a bad example for smaller boys.
     it is an expression of strong emotion, and all violent
     emotion is undesirable.
     it is unclean.
     it develops the habit of loose thinking.
     it shows a lack of self-control.

Now  read  these  reasons  all  again.  Cross  out  that  one  which  you 
think  the  poorest  and  weakest  of  all.  Cross  out  only  one." 


1. Watson, G. B., "Experimentation and Measurement," page 136.

When the subject is instructed to select the best answer,
the technique is often used to test discrimination, religious
ideas, ethical judgement, or insight into relationships, and
so is useful for more than testing information. We have already
stated that knowledge is more than information of facts. It is
not only information and ideas, but the ability to use these
ideas and facts in making right judgements, in discrimination,
and in solving moral problems. This is still distinct from
attitudes or conduct, for ethical and moral judgement is largely
a matter of knowledge and intelligence.

Professor Ruch has summarized the merits of the Multiple
Choice tests: (1)

1. Fairly easy to construct.
2. Purely objective.
3. Usually more reliable than True-false tests, but
   ordinarily not so reliable as well constructed
   Simple-Recall tests when equal numbers of items are
   considered.
4. May be made to test reasoning as well as facts.
5. A sufficient number of statements can be used to
   eliminate guessing to any desired degree (within
   practical limits).

1. Ruch, G. M., The Objective or New Type Examination, page 274 ff.

The last statement would indicate that more responses
are better than a few, other things being equal. In other
words, four to six responses minimize the chances of guessing
the right answer and would be preferable to a test with
two or three possible responses. On the other hand, we must
take into consideration that it is often difficult to find
four or more responses which are reasonable and plausible,
and still have varying shades of difference. To be really
analytical, each response should be significant, that is,
graded to indicate the amount of judgement, as well as the
kind of judgement or kind of ideas.
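
The effect of the number of responses on guessing can be put
in figures. The sketch below is an illustration only: it assumes
blind guessing with every response equally likely, the function
names are hypothetical, and the 60-item test is an invented
example, not one taken from Ruch.

```python
from fractions import Fraction


def chance_of_guess(num_responses: int) -> Fraction:
    """Probability of marking the keyed answer by blind guessing."""
    if num_responses < 2:
        raise ValueError("an item needs at least two responses")
    return Fraction(1, num_responses)


def expected_chance_score(num_items: int, num_responses: int) -> Fraction:
    """Expected number of items answered correctly by guessing alone."""
    return num_items * chance_of_guess(num_responses)


# Hypothetical 60-item test, compared across two to six responses.
for k in range(2, 7):
    print(k, "responses:", chance_of_guess(k), "per item,",
          expected_chance_score(60, k), "items expected by chance")
```

Under these assumptions a two response test yields half of the
items by chance, while a six response test yields only one sixth,
which is the sense in which more responses reduce guessing.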

Children seem to enjoy the Matching and Pairing type
of test, for it has the same fun values as a puzzle. Usually
one set of ideas is given and these are to be paired with
those in a second set. Sometimes names and dates are matched,
or it may be phrases and books of the Bible, or cause and
effect. The illustration below is composed of phrases
and books of the Bible and presumably measures one's
information as to where the quotations are found, or his skill
in assigning certain quotations to their rightful source.

In the first column are books of the Bible; in the
second column are quotations from these books. Before each
quotation write the number of the book from which it is
taken.

1. John              "Be ye therefore perfect even as
2. Matthew            your Father in heaven is perfect."
3. Proverbs          "To him that knoweth to do good and
4. Psalms             doeth it not, to him it is sin."
5. Jeremiah          "For God so loved the world that He
6. James              gave his only begotten Son"
7. I John            "Beloved now are we the sons of God"
8. II Corinthians    "I will lift up mine eyes unto the hills
9. Genesis            from whence cometh my strength."
                     "In the beginning God created -"

Professor Ruch has given both the advantages and
limitations of this type of test. (1) The technique has merit in
that it is purely objective, easily constructed for certain
types of subject matter, rapidly scorable, may be used to
measure either factual mastery or judgement, and chance
successes may be avoided by using ten or more pairs, or
incomplete matchings. It is limited because much subject
matter does not lend itself to this method, chance enters
appreciably in success or failure if five or fewer pairs
are used (although this may be partly overcome by using an
excess of statements in one column or the other), and very
long exercises are wasteful of pupils' time in searching
out the proper pairings.
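
The part which chance plays in a matching exercise can be
worked out directly. The sketch below is an illustration under
stated assumptions: the subject guesses blindly, and no entry
in the longer column is used more than once. The function names
are hypothetical and the figures are not taken from Ruch.

```python
from fractions import Fraction
from math import factorial


def chance_of_perfect_matching(num_items, num_options=None):
    """Probability that blind guessing pairs every item correctly.

    num_options may exceed num_items (an "excess of statements").
    """
    if num_options is None:
        num_options = num_items
    if num_items < 1 or num_options < num_items:
        raise ValueError("need at least as many options as items")
    # Ways of assigning distinct options to the items: m! / (m - n)!
    ways = factorial(num_options) // factorial(num_options - num_items)
    return Fraction(1, ways)


def expected_chance_matches(num_items, num_options=None):
    """Expected number of correct pairings from blind guessing."""
    if num_options is None:
        num_options = num_items
    return Fraction(num_items, num_options)


# Five pairs, ten pairs, and six quotations set against nine books.
print(chance_of_perfect_matching(5))      # 1/120
print(chance_of_perfect_matching(10))     # 1/3628800
print(chance_of_perfect_matching(6, 9))   # 1/60480
print(expected_chance_matches(6, 9))      # 2/3
```

With equal columns the guesser can expect one correct pairing
whatever the length of the exercise; an excess of entries in one
column lowers that expectation, which agrees with the remedy
suggested above.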

1. Ruch, G. M., "The Objective or New Type Examination," page 276 f.
2. Watson, G. B., "Experimentation and Measurement," pages 142 ff.

Ranking questions, though offering opportunity for
a multiple choice response, are somewhat different in the
type of action they demand. Professor Watson well describes
this test: (2) "The subject is asked to arrange in
order of goodness or aptness or truth or preference certain
statements or descriptions. There may be only two of them,
or there may be as many as ten. In practise it appears that
children in grade schools cannot effectively bear in mind
more than five proposals they are to rank. Ranking, of
course, involves comparing each one with every other one.
Ten is as many as an adult can handle comfortably. . . .
Again in ranking questions it is necessary to watch carefully
the two dangers of multiple choice questions: the first,
that suggestions may be so unusual that no one would ever
seriously consider putting them anywhere except at top or
bottom; the second, that some may be so close together that
it is impossible to decide clearly and fairly which should
come first."
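
Watson's remark that ranking involves comparing each proposal
with every other one explains why the burden grows so quickly
with the length of the list. The following sketch simply counts
those comparisons; it is an arithmetical illustration with
hypothetical function names, not a procedure given by Watson.

```python
from itertools import combinations


def pairwise_comparisons(num_statements: int) -> int:
    """Number of distinct pairs among the statements to be ranked."""
    if num_statements < 0:
        raise ValueError("number of statements cannot be negative")
    return num_statements * (num_statements - 1) // 2


def comparison_pairs(statements):
    """List every pair of statements that must be weighed against each other."""
    return list(combinations(statements, 2))


print(pairwise_comparisons(5))    # 10 pairs for five proposals
print(pairwise_comparisons(10))   # 45 pairs for ten proposals
```

Five proposals require ten comparisons and ten proposals require
forty-five, so doubling the list more than quadruples the work.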

Some of the illustrations given by Watson follow: (1)

1. Alternatives: read each pair and place a check in front of
   the one of the two which seems to you to be more Christlike.

     Christians should try to convert all nations to their
     belief.

     "Heathen nations" should keep the good things in their
     faith and only take the things from us that will enable
     them to make it better.

2. Below are four ways of acting in any one instance. Read
   all four ways of acting, and put a (1) in front of the
   one you feel to be the best, a (2) in front of the next
   best, a (3) in front of the next best, and finally, a
   (4) in front of the one that you think would be the
   poorest way of acting.

   a. A boy has taken his older brother's fountain pen
      and kept it:

        He should be severely punished.
        He should be given a chance to earn one for himself.
        He should be lectured severely.
        He should be given one of his own.

3. In the instance below, rank the suggested answers. Place
   (1) in front of the best, (2) in front of the next best,
   (3) next, then (4), (5), (6), (7) and (8) in order, placing (9)
   finally in front of the worst or poorest suggested answer.
   Be sure every one is ranked.

   Rank the following methods of keeping Sunday from the
   standpoint of their importance to a present day Christian
   boy:

     a. not making any noise
     b. washing dishes for your mother
     c. going to church on Sunday
     d. reading the comic section of the Sunday paper
     e. playing on a community baseball team
     f. praying a great deal
     g. eating a big dinner
     h. wearing your best clothes
     i. reading a good book.

1. Ibid., pages 142-144.

The ranking technique may measure certain kinds of
knowledge, especially ideas, judgements, and skill in
associating ethical responses with certain situations. A technique
which has value in discovering a subject's opinion as to
what he thinks best to do in certain situations begins to
border on the qualifications for a technique which will
measure more than knowledge, namely attitudes, interests
and ideals. Since we have named the most outstanding
techniques for measuring the knowledge factors, the kind of
measuring instruments necessary for measuring the more
dynamic factors, such as ideals, attitudes, interests, and
appreciations or prejudices, will be considered next.

The measurement of these dynamic factors offers
many difficult problems because of their complexity. These
problems will be discussed in test construction, so here a
survey of the techniques and types of tests which make an
attempt to measure attitudes and the other dynamic factors
must suffice. We have already seen that the ranking technique
seems to have value in discovering a subject's opinions, and
if he tells the truth, his expressed opinion ought to be
somewhat expressive of his own attitude. (1) It is desirable
to have test forms which do not put subjects on their guard,
for they are then more likely to give real information as to
what they would do. To this end, the most useful tests are
those which ask what other people would do, and tests which
ask for indirect clues, hoping that these will give some
guidance as to what a pupil would really do without his
being aware that he is answering that kind of question. (2)

Most test makers, realising the complexity of attitude,
do not call their tests attitude tests, but rather an attempt
to measure some one phase or expression of a man's attitude.
A man's opinion is one significant expression of his attitude.
Watson's "Survey of Public Opinion on Some Religious and
Economic Issues" uses several techniques in a battery of
tests to secure some measure of the inconsistency of public
opinion. These techniques are the most commonly used in
measuring this phase of attitude, so they are described here.

1. See Thurstone and Chave, "The Measurement of Attitude,"
   pages 6-8.
2. See Watson, G. B., "Experimentation and Measurement," p. 146.

Form A of this scale is a Cross-out test. The Cross-out
technique was illustrated in Kohs' Ethical Discrimination
Test as a type useful for measuring moral knowledge.
It has a slightly different use here. A list of fifty-one
words is given; the subject is asked to read the list,
consider each word quickly, and if it suggests more that is
disagreeable than is agreeable, to cross it out. He is asked
to work rapidly but to be sure to cross out every word which
is more annoying than pleasing, more antagonizing than
appealing, more distasteful than attractive. The first ten words
in the list follow:

     Bolshevist            Cigarettes
     Mystic                Religious Creed
     Sunday Blue Laws      Fundamentalist
     Roman Catholic        Big interests
     Higher Criticism      Birth control

The next technique used in this "Survey" is a Degree of
Truth Test. The directions read: "No one knows just what the
American people are thinking. There is need to find out just
what convictions are most firmly held on some disputed issues.
Indicate your opinion about each of the statements on the
following pages by drawing a circle around one of the numbers
in the margin which expresses your judgement." The numbers
given are +2, +1, 0, -1, -2.

+2 is marked if the subject thinks the statement is utterly
   and unqualifiedly true.
+1 is marked if he feels that it is probably true or true
   in large degree.
 0, if he feels that it is quite undecided, an open question,
   or one upon which he is not ready to express an opinion.
-1, if he feels that it is probably false or false in large
   degree.
-2, if he feels that the statement is utterly and
   unqualifiedly false.

Some of the statements are as follows:

1. The churches are more sympathetic with capital than
   with labor.
2. Dancing is harmful to the morals of young people.
3. Jesus was more interested in individual salvation
   than in social reconstruction.

Another degrees technique is used in the same "Survey"
in a generalization test. In this test the statements should
be preceded by one of the following words: All, Most, Many,
Few, No. The blank preceding the statement leaves room for
the subject to choose which of the suggested words should
begin the statement. The test is written in this form:

1. All  Most  Many  Few  No  ministers of churches lead
                             rather lazy lives.
2. All  Most  Many  Few  No  communists are men of high
                             ideals.
3. All  Most  Many  Few  No  ills of the body can be cured
                             by prayer.

The subject is asked to draw a circle around the one
which best expresses his own conviction.

Other Degrees tests which may be found allow the subject
to classify statements as being in one of three or four or
five classes. A statement may be regarded as

     Excellent, Good, Fair, Poor; or
     Like very much, Like, Neither like nor dislike, Dislike,
     Dislike very much; or
     Absolutely essential, Very desirable, Unimportant, Harmful.

The advantage of the Degrees type over the Multiple Choice
type of test is that the subject responds to every statement
in some way.

An Inference Test, also given in Watson's "Survey," is
more like the Multiple Choice test, which has already been
described. To illustrate how this is used to secure opinion,
rather than factual knowledge, I shall quote from Watson's
test: "Mere facts may mean different things to different
people. It is often important to know just what people think
certain facts mean. In the following pages you will find
several statements of fact, and after each, some conclusions
which some people would draw from them. Put a check in front
of each conclusion that you believe is fairly based upon the
fact as given here. Do not assume anything else than the
evidence given in the statement here, with all its terms
understood. You are not to consider whether the conclusions
are right or true in themselves, but only whether they are
rightly inferred from the facts given in the statement.

II. A young Christian was driven out of his job by his
Socialist fellow-workmen in a factory at Frankfort a. M.,
Germany, because he refused to give up his allegiance to
the Christian faith. The young man, trying to find another
job, broke down in health and finally died with the "flu."

1. Some Socialists disliked the Christians.
2. The young man patiently bore his cross as a martyr
   for Jesus Christ.
3. There is less and less place for pious Christians in
   this matter of fact world.
4. The young man was very annoying in the manner in which
   he thrust his religion in the face of others.
5. The world would be worse than it is today if the
   Socialists could run everything their way.
6. Germany is not a Christian nation.
7. Workmen are apt to be of low intelligence, or of low
   moral ideas, or both.
8. None of these conclusions can fairly be drawn."

The scoring plan for the battery of tests used in this
Watson Survey of Public Opinion is so arranged that one may
discover the direction in which the individual is most
likely to register prejudice. Each reaction is scored in
such a way that it may be classified as in general agreement
with one or more of twelve suggested lines of bias, as follows:

I.    In agreement with economic radicals.
II.   In agreement with economic liberals, favoring mild
      economic reforms.
III.  In agreement with economic capitalists, favoring the
      status quo, opposing economic radicals.
IV.   In agreement with a "social" interpretation of religion,
      as contrasted with an emphasis on personal communion.
V.    In agreement with a "personal" religion, mysticism,
      individual communion, salvation, etc.
VI.   In agreement with orthodox Apostles' Creed fundamentalists.
VII.  In agreement with "modernists," liberal Christians.
VIII. In agreement with religious radicals, possibly in
      opposition to religious forms and ideas.
IX.   In agreement with Protestants rather than with Roman
      Catholics.
X.    In agreement with Roman Catholics rather than with
      Protestants.
XI.   In agreement with strict moral standards, especially on
      questions of amusement, sex-conduct, intemperance.
XII.  In agreement with free, liberated moral standards,
      opposing censorship, etc.

One's profile, according as he agrees with each of
these lines of bias, is made after the scoring of the
battery of tests.

1. Analytical Score Sheets for the Watson Test of Public
   Opinion, page 2.

In  some  tests,  to  get  a norm  for  rating  the  attitude 
of  a person  in  terms  of  the  group,  the  attitude  of  many 
people  is  found  by  taking  the  mean  attitude  of  a group. 

Then  every  individual  is  rated  in  terms  of  this  mean.  It 
is  readily  recognized  that  much  subjective  judgement 
enters  into  such  a rating. 

The True-false technique has value in discovering
attitude, as used in Case's Test of Liberal Thought. Notice
how the marking of these few statements will reveal the
attitude of the subject concerning his liberality or
narrowness, prejudice or fairmindedness. (1)

1. For Jesus the main test of discipleship lay in
   belief in his miracles - - - - - - - - - - - -  True  False
2. Children are able to help in the making of
   important family decisions - - - - - - - - - -  True  False
3. Some women, whose husbands are able to support them,
   are nevertheless justified in holding regular
   paid positions - - - - - - - - - - - - - - - -  True  False
4. The conception of God remains unchanged
   throughout the Bible - - - - - - - - - - - - -  True  False
5. Education is the imparting of knowledge - - -  True  False

1. Case, Adelaide Teague, "A Test of Liberal Thought,"
   questions 35 to 39.

Another technique for measuring attitude is found in
Thurstone and Chave's "Scale for Measuring Attitude Toward
the Church." The subject is asked to check every statement
with which he fully agrees. Some of the statements follow:

1. I think the church is a divine institution, and it
   commands my highest loyalty and respect.
2. I am neither for nor against the church, but I do not
   believe that church going will do anyone any harm.
3. I feel the good done by the church is not worth the
   money and energy spent on it.
4. I regard the church as a monument to human ignorance.
5. I believe that the church is losing ground as education
   advances.

These statements were each given a scale value by the
collective judgements of three hundred people, so in scoring,
the subject is placed on this scale by taking the average
of the scale values of the statements which he checked. To
illustrate, the scale values of the above five statements
are as follows:

     Statement number 1 - Scale value  2
         "       "    2 -   "     "   13
         "       "    3 -   "     "   20
         "       "    4 -   "     "   25
         "       "    5 -   "     "   18

Anyone whose average is 0 to 4 is strongly favorable to the
church, from 5 to 8 is favorable, from 9 to 11 is favorable
with reservation, from 12 to 14 is slightly favorable, from
14 to 17 is antagonistic, and from 17 to 24 is strongly
antagonistic. The scale is useful, then, for classifying
individuals according to whether they are favorable or
unfavorable in their attitude toward the church, with some
indication as to the amount or degree of the quality which
they manifest.
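
The scoring just described, averaging the scale values of the
checked statements and reading off a class, can be sketched as
follows. The five scale values are those quoted above. The
treatment of the class boundaries is an assumption made only for
this illustration: since the stated ranges touch at 14 and at 17
and leave gaps between whole numbers, each range is read here as
an inclusive upper limit. The function names are hypothetical and
are not part of the Thurstone and Chave scale.

```python
# Scale values of the five sample statements, as quoted in the text.
SCALE_VALUES = {1: 2, 2: 13, 3: 20, 4: 25, 5: 18}

# Assumed reading of the classes: each number is an inclusive upper limit.
CLASSES = [
    (4, "strongly favorable"),
    (8, "favorable"),
    (11, "favorable with reservation"),
    (14, "slightly favorable"),
    (17, "antagonistic"),
]


def attitude_score(checked_statements):
    """Average scale value of the statements the subject checked."""
    if not checked_statements:
        raise ValueError("the subject checked no statements")
    values = [SCALE_VALUES[number] for number in checked_statements]
    return sum(values) / len(values)


def classify(score):
    """Place an average score in one of the descriptive classes."""
    for upper_limit, label in CLASSES:
        if score <= upper_limit:
            return label
    return "strongly antagonistic"


# A subject who checks statements 1 and 2 averages (2 + 13) / 2 = 7.5.
score = attitude_score([1, 2])
print(score, classify(score))
```

A subject who checked statements 3, 4 and 5 would average 21 and
fall in the strongly antagonistic class under the same reading.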

To the uninitiated, the attitude tests may seem to fall
so far short of perfectly measuring attitude that they are
apt to be discarded, unless it is understood that these
attempts are so recent, and the trait to be measured so
complex, that refinement can only come through continued
experiment and use. Even in the present stage, if carefully
interpreted and used, some of these tests measure much
better than one could guess, and so are of that much value.

The third classification of tests now in the field
of education is that of the tests of character or the control
of conduct. Tests of character are apt to be either tests of
attitude, or tests of controlled responses in specific
situations. Hartshorne and May, in their recent and
valuable studies, use the phrase, "The conscious control of
conduct." This is the result which all Bible teachers either
consciously or unconsciously want.

One of the first attempts at conduct measurement was made
by Paul Voelker in developing the ideal of trustworthiness. (1)
While Voelker made a significant contribution by pioneering in this
field, he rather assumed the validity of his tests. The outstanding
work in this line has been done by the Character Education
Inquiry under the direction of Hugh Hartshorne and Mark May, in
collaboration with Teachers College, Columbia, and financed largely
by the Institute of Social and Religious Research. The results have
been published in three volumes, "Studies in Deceit," "Studies in
Service and Self Control," and "Studies in the Organization of
Character."

1. Voelker, P.P., "The Function of Ideals in Social Education."

Most of the techniques devised and used by the Inquiry
are arranged for class room situations, although there are some
parlor games and athletic contests included in the battery. It
would be impossible to describe here all of the techniques used,
but a small sampling is given to show the nature of the tests. It
is well to remember that the Inquiry does not attempt to measure
a large inclusive trait in one test, for they regard any so called
character trait as a loosely organized bundle of specific habits;
consequently when they attempt to measure the character trait of
deceit they turn to specific situations and specific habits, and
measure these through a large battery of tests. In measuring deceit
they never attempted to measure more than one phase of the tendency
to deceive at one time, such as the stealing type of deception,
and even then several devices were constructed to
measure this one phase of deceit. In measuring service
tendencies, they tried to measure cooperativeness, self-sacrifice,
participation in social service activities, etc., as component
elements of service. In measuring the tendencies of
self-control they tried to discover the amount of inhibition and
persistence a child exhibits under certain conditions. A few
of the techniques used by them in these studies are
described here:

The Duplicating Technique for measuring the Cheating type of
Deceptive Behavior: (1) A rather common form of classroom
deceptiveness occurs when the pupil makes illegitimate use
of a key or answer sheet either in doing his work or in the
scoring of his own test paper. There are two ways of
handling this situation; one is called the duplicating technique.
Any sort of test is given, preferably the short answer type.
The papers are collected and taken to the office, where a
duplicate is made of each paper. Great care is taken to be
certain that an exact record is made of what the pupil
actually did on the test. At a later session of the class, the
papers are returned and each child is given a key or answer
sheet and is asked to score his own paper. The self-scored
papers are then compared with the duplicates and all changes
are recorded. Deception consists in illegitimately
increasing one's score by copying answers from the key.
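
The comparison of the self-scored paper with its duplicate can be
sketched as a simple item-by-item check. This is an illustration
only, not the Inquiry's own recording procedure: the way the
papers are represented (a mapping from item number to the answer
written, with None for an item left blank) and the function names
are assumptions made for the example.

```python
def record_changes(duplicate, self_scored, key):
    """List every item whose answer differs between the two papers.

    Each change notes whether it turned a wrong or blank answer
    into the keyed answer, that is, whether it raised the score.
    """
    changes = []
    for item, keyed_answer in key.items():
        before = duplicate.get(item)
        after = self_scored.get(item)
        if before != after:
            changes.append({
                "item": item,
                "before": before,
                "after": after,
                "raised_score": before != keyed_answer and after == keyed_answer,
            })
    return changes


def shows_deception(changes):
    """True when at least one change raised the pupil's score."""
    return any(change["raised_score"] for change in changes)


# Hypothetical three-item paper: item 2 was altered to agree with the key.
key = {1: "a", 2: "c", 3: "b"}
duplicate = {1: "a", 2: "b", 3: None}
self_scored = {1: "a", 2: "c", 3: None}
changes = record_changes(duplicate, self_scored, key)
print(changes, shows_deception(changes))
```

Only changes that raise the score are counted as deception here,
in keeping with the definition given in the paragraph above.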

1. Hartshorne and May, "Studies in Deceit," pages 51-52.
2. Ibid., pages 55, 56.

The Improbable Achievement Technique: (2) This consists in giving
a test under conditions such that achievement above a
certain level will indicate deception. Chance alone will
provide a certain amount of success, but achievement beyond a
certain point is evidence of deception by peeping. This
technique was used with two kinds of tests, one requiring
the use of paper and pencil and the other the use of objects
like puzzles or games. There are certain kinds of mechanical
puzzles which may be effectively used. The puzzle must appear
simple but in reality be very difficult. It must require
genuine skill rather than the knowledge of a secret trick or
principle. It must be of such a nature that the dishonest
pupil can fake a solution or appear to have solved it when
he really did not. It was a difficult problem to find puzzles
that could be purchased which were not being sold in the toy
stores at the time. One of the puzzles used was the Fifteen
Puzzle. It consists simply of a small box four inches square
with sixteen blocks each one inch square and numbered zero
to fifteen. The small squares are made of wood. They were
arranged in a standardized chance order as follows:

10  8 5 13 

15  6 2 3 

9 11  12  0 

14  1 7 4 

The problem is to remove the one marked zero (0) and then by
sliding the others around get them in this order:

      1   2   3   4
      5   6   7   8
      9  10  11  12
     13  14  15

It is strictly forbidden to remove any block from the board.
The puzzle must be solved by sliding the blocks around. Five
to eight minutes were allowed. Here cheating consists in
taking the blocks out and placing them in the correct order
without playing the game. In the places provided on the
score sheet the child makes a record of the numbers on the
squares as they appeared when time was called. To obtain
the amount score, the rows across the square were taken as
units and weighted thus:

     1st row correct,   1   2   3   4  . . .  1 credit
     2nd row correct,   5   6   7   8  . . .  2 credits
     3rd row correct,   9  10  11  12  . . .  3 credits
     4th row correct,  13  14  15      . . .  4 credits
     Maximum score                           10 credits

Before a pupil was marked "c" (cheated) or negative, he
must have scored the maximum. (1)
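
The weighting of the rows may be sketched as follows. The
credits and the rule that only a maximum score is marked "c"
come from the description above; the representation of the board
as four rows, with None standing for the empty square, and the
function names are assumptions made for the illustration.

```python
# Target arrangement; None stands for the empty square.
TARGET_ROWS = [
    (1, 2, 3, 4),
    (5, 6, 7, 8),
    (9, 10, 11, 12),
    (13, 14, 15, None),
]
ROW_CREDITS = [1, 2, 3, 4]
MAXIMUM_SCORE = sum(ROW_CREDITS)  # 10 credits


def amount_score(recorded_rows):
    """Sum the credits of every row recorded in its correct order."""
    if len(recorded_rows) != len(TARGET_ROWS):
        raise ValueError("the record must contain four rows")
    score = 0
    for target, credit, recorded in zip(TARGET_ROWS, ROW_CREDITS, recorded_rows):
        if tuple(recorded) == target:
            score += credit
    return score


def marked_c(recorded_rows):
    """A pupil is marked "c" only when the maximum score is recorded."""
    return amount_score(recorded_rows) == MAXIMUM_SCORE


# The standardized starting order, with the zero block removed.
start = [(10, 8, 5, 13), (15, 6, 2, 3), (9, 11, 12, None), (14, 1, 7, 4)]
print(amount_score(start), marked_c(start))   # 0 False
```

A record showing only the first two rows in order would earn
three credits and would not be marked "c" under this rule.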

1. No pupil could rearrange these pieces according to rules in
   the time allowed. So if the pieces were arranged properly, then
   the pupil must have cheated.

Measuring cheating as exhibited in Parlor Games is
illustrated by the game called Bean Relay. This is a modified
potato race, using beans instead of potatoes, which was
developed for this purpose after considerable
experimentation. Each row has four boxes, the first empty, the
second and third with three beans each, and the fourth with
ten or more. If possible five rows are run at once to give
the atmosphere of contest. Each heat is thirty seconds,
which is time enough for eight or more runs, that is, for
eight or more chances to cheat. The rule is to pick up one
bean at a time. The number of beans in the home box should
correspond with the number of runs which has been checked
by the observer. If a child is found to have more beans than
runs, this is evidence of deception and he is given a "C". (1)

Cooperativeness and self-sacrifice are elements of
Service. The Money Voting test [2] is one of the devices
used for measuring these elements. Learning exercises
were given and, after they were finished, the teacher
suggested the possibility that, since this was a contest and
different classes were competing, the class might win
the prize. A straw vote was proposed in advance of any
discussion, to see how the class would dispose of the
money if it should be won. The ballot was as follows:

"Write your name on the slip and then mark 1 what you would
most like to do with the money. Mark 2 your second choice.
Mark 3 your third choice. Mark 4 your fourth choice. Mark
5 your fifth choice.

Name ____________

(4) Give all the money to the boy or girl scoring
    highest in the test.

(2) Buy something for our school, such as bats, balls,

(3) Buy something for the room, such as a picture, a
    globe of goldfish, some plants.

(5) Divide the money equally among the members of the
    class.

(1) Buy something for some hospital child or some family
    needing help or for some other philanthropy."

1. Ibid., page 88

2. Studies in Service and Self Control, Hartshorne and May,
pages 56-57

A theoretically  correct  arrangement  of  the  items  on  the 
ballot  was  determined  partly  by  reference  to  the  way  the 
children  actually  voted  and  partly  by  the  judgement  of  the 
staff  of  the  Inquiry  as  to  the  relative  social  significance 
of  each  choice.  This  "correct"  method  of  voting  or  ranking 
is  indicated  in  brackets  before  each  statement  given  above. 
These  were  omitted  as  the  suggestions  were  given  to  the 
children.  If  a pupil,  in  voting  his  choice  of  alternatives, 
gave  an  item  the  rank  shown  in  the  key,  this  item  was 
scored  2.  If  it  was  ranked  one  place  out  of  position,  it  was 
sometimes  dropped  to  the  value  of  one  and  sometimes  the 
value  of  2 was  still  given.  But  if  it  was  more  than  two  steps 
removed from its correct rank, it was usually marked 0. The
key  on  the  ballot  shows  that  the  fifth  alternative,  to  buy 
something  for  someone  in  need,  was  rated  highest.  Hence  if 
the  subject  rated  it  one,  he  received  a score  of  2.  If  he 
made  this  his  second  choice  he  received  a score  of  1.  The 
test  gave  very  interesting  results. 
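The scoring of a ballot against the key may be illustrated by a short Python sketch. The text says that the credit for near misses varied ("sometimes" 1, "sometimes" 2), so the rule below is only one possible reading, assumed for the example: 2 for the exact rank, 1 for one place out of position, and 0 otherwise. The item labels are likewise hypothetical shorthand for the five alternatives.

```python
# One possible reading of the Money Voting scoring rule (an assumption:
# the source says the credit for near misses was not always the same).
KEY = {
    "highest_scorer": 4,   # give all the money to the highest scorer
    "school": 2,           # buy something for the school
    "room": 3,             # buy something for the room
    "divide": 5,           # divide the money among the class
    "philanthropy": 1,     # buy something for someone in need
}


def score_ballot(ranks):
    """Score a pupil's ranking (item -> rank 1..5) against KEY."""
    total = 0
    for item, key_rank in KEY.items():
        distance = abs(ranks[item] - key_rank)
        if distance == 0:
            total += 2      # rank agrees with the key
        elif distance == 1:
            total += 1      # one place out of position
        # two or more places out of position: no credit (assumed)
    return total


ballot = {
    "highest_scorer": 4,
    "school": 1,
    "room": 3,
    "divide": 5,
    "philanthropy": 2,
}
print(score_ballot(ballot))   # 8
```

Under this reading a ballot that agrees with the key throughout receives 10, and the example ballot, which exchanges the first and second choices, receives 8.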

In measuring persistence, an element in self-control,
an appeal was made to the subject's curiosity to see how
an incomplete story really ended. These tests were called
the Story Resistance, or S-R, tests. Three stories are used,
two of escape from danger and one of adventure. The first
story is read to the class up to the climax, and the ending
is supplied written in such a way that it is very difficult
to read, the difficulty increasing as the end is approached.
The child is given the option of finishing the story or
beginning a new one, which ends in the same difficult way.
Again he is given the option of finding how this story ends
or beginning a new one. Three degrees of reading difficulty
are provided in the ending to each story. The first level
consists in capital letters run together with no spaces
between words, as this: CHARLESLIFTEDLUCILLE. A second order
of difficulty is secured by mixing small and capital letters.
A third order of difficulty is reached by using both small
and capital letters with spaces coming at irregular places
rather than at the endings of words. The instructions are to
place a vertical line between the letters to show where
one word ends and the next one begins. This makes reading
somewhat easier and also enables the examiner to determine
how far each child has read. A uniform time of three
minutes was allowed on a practise exercise first. The score
on this was the number of words of text spaced off in the
way just described. When the end of the story was presented,
the length of time each child worked on it was noted at the
time, and later the number of words he spaced off was counted.
The score was the number of words spaced off in the story
divided by the number of words spaced off in the practise
period of three minutes. It is an approximate measure of
the time (in three minute units) the pupil really worked
on the story.
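The ratio just described can be written out as a small Python sketch. The function names and the sample figures are hypothetical, chosen only to show the arithmetic; they are not taken from the original study.

```python
# Sketch of the Story Resistance (S-R) score: words spaced off in the
# story divided by words spaced off in the three-minute practise period.
PRACTISE_MINUTES = 3


def persistence_score(words_in_story, words_in_practise):
    """Return the score in three-minute units of effective work."""
    if words_in_practise <= 0:
        raise ValueError("the practise count must be positive")
    return words_in_story / words_in_practise


def approximate_minutes(words_in_story, words_in_practise):
    """Convert the score into an approximate number of minutes worked."""
    return persistence_score(words_in_story, words_in_practise) * PRACTISE_MINUTES


# Hypothetical pupil: 40 words in practise, 100 words in the story ending.
print(persistence_score(100, 40))      # 2.5
print(approximate_minutes(100, 40))    # 7.5
```

Dividing by the practise count allows for differences in reading speed, so that a slow reader who keeps at the story is not scored below a fast reader who gives up early.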

A very few of the techniques used in conduct measurement
have been described, but the fact that so many specific
measurements must be made of the many traits of conduct
means the construction and giving of a great number of tests
in a single battery before one can really hope to measure
even one phase of a general trait such as self-control or
service. Forty-one temptations, or samplings of behavior,
were necessary in testing one phase of inhibition alone.

The necessity for so great a number of tests constitutes
one of the problems which faces anyone who attempts to test
conduct, and this description of a few of these tests and
techniques is only to show the nature of the tests which
are now being used in experimentation.

An entirely different approach to the measurement of
character [1], including conduct or habits of conduct, is from
the standpoint of individual rating scales. As objective
a scale as possible is made by selecting a person known
to the rater as ranking extremely high in the quality
being measured; then selecting one ranking extremely low.
Others in the group are rated in terms of these well known
individuals. Often there are four or five on the scale by
whom the members of one's class or group are measured. The
value of such a scale is that it does offer something
objective by which to measure, but the problems of subjective
estimates and of getting people on the scale who are known
to all, or to several people who would use the scale, are
still unsolved.

1. Character is included as one of the outcomes of Bible
Teaching.

The Christian Quest "Five Point Scale of Individual
Growth" is for individual use. The leader and subject
(probably some young person) meet together and talk through
the scale and the problems which arise in marking it. The
subject may then mark the scale himself, or do it with the
help of the leader. A profile is drawn from the marking,
and after a period of time, the scale is re-marked and a new
profile drawn. The profiles are then compared to note the
growth. A copy of the scale follows:

[Chart: MY CHRISTIAN QUEST, Five-Point Scale of Individual
Growth. Prepared by the Committee on Religious Education of
Youth. Approved for experimental use, February, 1927, by the
International Council of Religious Education. The chart
includes a section headed MY INDIVIDUAL ACTIVITIES, with the
note: "In the pamphlet 'Program Suggestions for Group Leaders'
there are listed several hundred activities in which youth can
engage. The owner of this chart will choose and list here
individual activities and then check them off as they are
completed." A further heading reads: 11. SPECIALIZED RELIGIOUS
ACTIVITIES.]

In summarizing section III let me restate that the
problem of getting acquainted with what is being done, and
has already been done, in the nature of testing in the field
of Religious Education faces anyone who attempts to measure
the outcomes of Bible Teaching. Many of the tests already
available are better than one person or one school by itself
could devise as the needs for tests arise. Although tests to
meet local situations and individual class instruction
must be devised, yet the knowledge of what has been done
in the field is essential. This section of this paper
has shown the advantages of objective testing, and
recognizes the limitations of the tests which are now
available, as well as their value, both for use and for
suggestions in helping others to devise tests.

IV  Some  of  the  Problems  Involved  in  the  Construction  of 
Tests. 

Most Bible teachers will need to know something of
test construction, for although the best tests will probably
be made by the experts and leading educational institutions,
yet those who expect to use the tests made by others, and to
make those they need in their local situations to measure
their own teaching objectives, will often be faced
with the problem of constructing a measuring instrument
to meet their own needs. Such people as Watson, Ruch,
Hartshorne and May, Thurstone and Chave, and Monroe have
written very elaborate discussions on how to construct good
tests. Anyone who expects to construct tests will need to
be familiar with these extensive works. It is not the purpose
here to tell how to construct a test, nor to discuss all the
problems involved in test construction, but merely to
enumerate some of the problems involved in the making of
tests.

The first problem will be the determining of what
one is going to measure. What one measures is determined by
his objectives, his teaching, and his pupils, or as Professor
Watson states it [1], "In test construction, as with almost
everything else in experimentation, the motto should be
'Purpose first'. Every step in the total process depends
upon what the test is supposed to do, and the groups in which
it is expected to do it." Professor Ruch adds to this [2],
"The tests must parallel the actual teaching... any test
must represent an extensive sampling of the materials of
instruction." All of this takes us back to our objectives
and proves what was at first stated, that measurement demands
a goal or objective. The teaching material is also chosen in
accordance with the results which one hopes to achieve,
and of course the results or objectives are chosen on the
basis of the pupil's need. The testing, then, must be
comprehensive and cover this field by a series of tests,
although it is true
that some one test may cover only one small phase of this

1. Watson, Goodwin B. "Experimentation and Measurement" p. 127

2. Ruch, G.M. "The Objective or New Type Examination" page 30




field. The worth-whileness of this phase, and its importance
to the total result hoped for, must be considered in
determining what one measures. It may be necessary to
limit or define in terms of certain unitary situations
under specified conditions, especially if one is measuring
character traits of conduct; and again it may be necessary
to expand the field and sample carefully the most important
phases, especially if one is testing the knowledge of the
contents of a book or subject, such as the gospels, or
the life of Jesus. It will be a problem to make the test
measure the one thing it sets out to measure and avoid the
interesting by-paths of irrelevant material or opinions.

The next problem will likely be to determine the
technique that will best measure the thing one sets out
to measure. In section III it was discovered that the thing
one wants to measure determines the type of test that will
be used. If one is testing attitudes his techniques will
differ somewhat from those he will use in testing knowledge
of facts. If he is testing conduct, the techniques will
differ still more. But even within these three realms,
the techniques will vary according to the adaptability
of the technique to the testing material. There is no set
rule for deciding the technique best adapted, but experiment
and experience will help one to learn which one
to use. It is often a good plan to try out several
techniques and see how well they adapt themselves to the
material at hand. Sometimes one will be selected, and
sometimes several may be chosen and a battery of tests made up
to get the final score.

Then comes the procedure of building a test. This
procedure will differ very much according to the kind of
test one is building. If the test material is concerned
only with factual knowledge and ideas, and the ability to
use these ideas or facts, the procedure of building a test
will be partially determined, and it will be somewhat the
same for all tests which measure knowledge. If the test
is to measure attitudes or conduct, then the procedures
differ. Professor Ruch has given a rather comprehensive
outline for the construction of knowledge tests. [1]
Following his outline, the first step is drawing up a table of
specifications. This is merely an outline of the essential
items, arranged with less essential or subordinate ones
as sub-heads. The whole outline of material must be so
arranged that the relative importance of all the material
may be kept in mind in building the test. Next, he
suggests drafting the items in preliminary form, according
to the technique which has been chosen, and selecting
enough items to make possible an elimination of the
poorer ones and a selection of the best. The length of
the test is then decided on the basis of the number of the
worthwhile items that could be made, and the number needed
to cover the subject thoroughly, or to present a sampling
fair enough and large enough to give a reliable estimate

1. Ruch, G.M. "The Objective or New Type Examination" pp. 149-187

of the pupil's ability. Then the items are rated for
difficulty. If one is making a standardized test, this must
be done objectively and statistically through the
experimental use of the test and by discovering the actual
difficulty as found in the scores of those taking the test.
This is often impossible for an individual teacher when
she is making a test for her own class. If, however, she
improves the test through the years of her teaching she
can discover the difficulty by keeping her data and using
it to check up with the actual scores made on each question.
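A teacher who keeps her data in this way might estimate difficulty as follows. The Python sketch below is an illustration only: it assumes that each pupil's paper is recorded as a list of 1 (right) and 0 (wrong), and it takes the proportion of pupils answering an item correctly as the index of its easiness, which is a common convention rather than a procedure stated in the source.

```python
# Sketch: estimate item difficulty from kept score records.
# Assumption: papers[p][i] is 1 if pupil p answered item i correctly, else 0.


def proportion_correct(papers):
    """Return, for each item, the proportion of pupils who answered it correctly."""
    n_pupils = len(papers)
    n_items = len(papers[0])
    return [sum(paper[i] for paper in papers) / n_pupils for i in range(n_items)]


def easiest_first(proportions):
    """Return item positions ordered from easiest to hardest."""
    return sorted(range(len(proportions)), key=lambda i: proportions[i], reverse=True)


papers = [
    [1, 0, 1],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 1],
]
easiness = proportion_correct(papers)
print(easiness)                 # [1.0, 0.25, 0.75]
print(easiest_first(easiness))  # [0, 2, 1]
```

With so few pupils the proportions are rough; they become more dependable as records accumulate over the years.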

It is better to put easy items first and continue to place
them in an increasing order of difficulty [1], as a pupil's
ability is better determined by being able to see how far
into the test he is able to work in the given time. The
test should require as little reading as possible and as
little writing as is practical. It takes time to read and
write, and since the test is made for the purpose of
discovering something besides reading and writing abilities,
these should not interfere with the abilities being
measured. Directions should be as carefully, completely
and clearly stated as possible, and yet as brief as is
consistent with these requirements. The test should
provide for controlling the conditions under which the
performance is given. This control should be such that the
same testing conditions will be approximated by different
examiners in different places and at different times. [2]

1. Ruch, G.M. "The Objective or New Type Examination" Ch. 2

2. Monroe, "The Theory of Educational Measurements" p. 65




The scoring should be objective so that the score will be
constant regardless of who scores it. Where possible the
score should be diagnostic or at least analytical. It should
tell more than that a pupil is good or poor. It ought to
tell how good or how poor, and where he is good or poor.
This means that the test maker has the problem of making up
his statements or questions so that he has questions which
indicate different degrees of difficulty and also specified
or known phases of difficulty. Furthermore, according to
Monroe [1], "The test should provide adequate opportunity for
all pupils to demonstrate their abilities in the field
defined by its function, and it should make possible the
description of a pupil's performance in terms of the
dimensions or characteristics that are significant." The mere
statement of these guiding requirements cannot give an
adequate idea of the problems involved in trying to carry
them out. They can be fully realized only through
experience. Of course, it is not always possible to adhere to
every requirement, but the more valid and reliable tests
will adhere to these requirements. Added to these
requirements and the problems involved in carrying them out is
the problem of establishing the reliability of the test
and obtaining evidence for its validity. The statistical
problem here involved, as well as in interpreting
the results, is sufficient to baffle the uninitiated
statistician.

1. Monroe, W.S. "The Theory of Educational Measurements"
page 65

Just for illustration of another kind of problem which
faces the test-maker, recall some of the quotations from the
Union Test of Religious Ideas on pages 35 to 37. Protests
were raised against using these tests on the grounds that
they caused irreverence and undesirable criticism, and
suggested doubts rather than faith in the children who were
subjected to them. For instance, in order to discover the
kind of ideas possessed by the subject, the test asked the
subject to indicate whether certain prayers were good,
fair or poor. One of the prayers was "Father, forgive them
for they know not what they do". I am quoting from a
protest circulated by a father whose children were asked to
take the test: "We think the prayer which fell from our
Savior's lips in that holy and awful place, Calvary, is the
most sacred that ever ascended to the throne of God. Is it
possible to conceive of a greater sacrilege, a more fearful
violation of the sanctities of human thought and the finest
sensibilities of the human soul than to have a Theological
Seminary (where the test was made) request our children to
read that holy prayer and then to answer whether it is
Excellent, Fair, or Poor." The protest continues against
other parts of the test. Of course, the test is no longer
printed as a result of this protest, but the problem remains
as to how
one can build a test to test all kinds and degrees of
ideas without suggesting the entire range.

The problem of building an attitude measuring instrument
is somewhat different, although some of the same requirements
apply. The first observation in this field is that the
attitude factors are much more complicated than the knowledge
factors, which were relatively simple. In the first place
knowledge enters into, and is a part of, these dynamic factors.
What one knows about a thing determines to some extent his
attitude toward that thing or his interest in it. How large
a part knowledge plays in controlling conduct is a much
debated question, partly
determined by one's philosophy and his psychology of
education. There is a rather general acceptance of the fact
that ideas tend to find expression either in thought, feeling
or act. But it is not the knowledge element with which we
are most concerned now, for although we grant that knowledge
tends to find expression in feelings and acts, yet one
may have knowledge which does not function in either
attitudes or conduct. One may know, for example, that
over-indulgence in sweets lessens physical endurance, but he
may continue to
eat sweets and to have an indifferent attitude toward this
bit of knowledge until he wants to play football. Then
his desire to be a football player and to get on the team
may cause him, for the time being at least, to set up
for himself a higher appreciation of that bit of
information to which formerly he was so indifferent. His
emotions and feelings brought into play in this experience
may even cause him to react against sweets so that he
does not really care for them. (More possible than probable.)
Knowledge was necessary, knowledge of the fact, but it
did not change his fondness for sweets nor cause him to
give them up when he was fond of them. If we are to be
successful Bible teachers or religious educators we hope to
change or properly emotionalize ideas so that they will
in turn motivate conduct. The dividing lines between
knowledge and attitudes, and between attitudes and control
of conduct, are not sharply defined, so when one comes to
measuring in these fields, he will find some repetition
as to technique.

It has already been pointed out (page 41) that the ranking
technique seems to have value in discovering a subject's
opinion as to what he thinks best to do in certain situations.
It is a question, again, whether one's opinion as given in
a test is truly his own opinion or whether he gives what
he knows to be the accepted standard when his knowledge of
what is right is better than his own feeling or acting
in the matter. Tests in the realm of attitudes and conduct
break down to some extent at this point. It is often a
question whether the tests give anything like a true
index of the individual's own attitudes and conduct, but
still educators believe that while these tests do not
measure perfectly or ideally, yet some of them give a
better index, when carefully given and interpreted, than
chance guessing would do. They have proven especially useful
in analyzing and in helping young people to analyze their
own attitudes and conduct, as illustrated in The Christian
Quest Chart of Individual Growth.

Thurstone and Chave, in "The Measurement of Attitude",
have attempted to define the concept "attitude" and to defend
the using of "opinion" as an index of attitude. A brief
review of their discussion is significant here. [1] The concept
"attitude" is used by them "to denote the sum total of a
man's inclinations and feelings, or prejudice or bias,
preconceived notions, ideas, fears, threats, and convictions
about any specific topic". The concept "opinion" is used
as a verbal expression of attitude. If a man says we made
a mistake in entering the war with Germany, that statement
would be called his opinion. The term "opinion" is restricted
to verbal expression, but it is supposedly an expression
of the speaker's attitude. It is on this assumption that
Thurstone and Chave have constructed a scale for measuring
attitude toward the church, which is a significant piece
of work in attitude measurement. Their procedure in
building this scale will be briefly described in order to show
up some of the other problems with which they were faced.
These same problems are common to the building of many
attitude scales, although there are other methods of
measurement involving still other kinds of problems.


1. Thurstone and Chave, "The Measurement of Attitude" p. 66 ff.

After deciding to measure a subject's attitude as
expressed by his acceptance or rejection of opinions, and
after deciding that the thing they would try to measure
was his attitude toward the church, they began to collect
people's opinions regarding the church. This necessitated
making some sort of a form or questionnaire to ask for the
opinions, and to give the subjects a motive or desire
for giving opinions which were consistent with the truth.
There were then five criteria which they applied in editing
these statements and making them useable for their scale.

1. The statements should be brief.

2. They should be such that they could be indorsed or
rejected in accordance with their agreement or disagreement
with the attitude of the reader.

3. Every statement should indicate something regarding
the reader's attitude about the issue in question.

4. Double-barreled questions should as a rule be avoided.

5. It is necessary that at least a fair majority of the
statements really belong on the attitude variable that is
to be measured.

After the statements were secured and edited,
they had to be sorted into eleven piles to represent an
evenly graduated series of attitudes from those extremely
against the church to those which are very much in favor
of the church. So that this sorting would be most
objective, or at least be a reliable sorting, 300 people were
asked to sort the statements not according to their own
opinions or attitudes, but according to their judgements
as to whether the statements were favorable or unfavorable
to the church. This process is a long and tedious one. The
directions must be mimeographed, as well as all the
statements; they must be given out to the three hundred people.
Then the results of their tabulations must be recorded
and summarized in such a way as to show for each subject the
pile in which he placed every one of the 130 statements.
This data in turn was assembled into a table showing the
frequencies for each statement in the separate piles. From
this data the median scale value was determined graphically
as well as the spread of the middle fifty percent of the
sortings. This gave the scale value of a statement and its
ambiguity, so that one could see to what extent it meant
different things to different people. The reliability of
this scale value was then determined statistically.
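The scale value and the measure of ambiguity just described can be illustrated numerically. The text says these were read off graphically; the Python sketch below swaps in linear interpolation within the cumulative frequencies as a stand-in for the graph. It further assumes that pile k covers the interval from k-1 to k on the base line, and the frequencies shown are invented for the example.

```python
# Sketch: median scale value and spread of the middle fifty percent
# for one statement, from the number of judges placing it in each of
# the eleven piles. Interpolation replaces the graphical method.


def scale_point(frequencies, fraction):
    """Return the point on the base line below which `fraction` of the sortings fall."""
    total = sum(frequencies)
    if total == 0:
        raise ValueError("no sortings recorded for this statement")
    target = fraction * total
    cumulative = 0
    for index, count in enumerate(frequencies):
        if count > 0 and cumulative + count >= target:
            # Pile number index + 1 is assumed to span [index, index + 1].
            return index + (target - cumulative) / count
        cumulative += count
    return float(len(frequencies))


def scale_value_and_ambiguity(frequencies):
    """Return (median scale value, spread of the middle fifty percent)."""
    median = scale_point(frequencies, 0.50)
    spread = scale_point(frequencies, 0.75) - scale_point(frequencies, 0.25)
    return median, spread


# Hypothetical sortings of one statement by 300 judges.
sortings = [0, 0, 0, 30, 60, 120, 60, 30, 0, 0, 0]
print(scale_value_and_ambiguity(sortings))   # (5.5, 1.5)
```

A statement whose sortings are widely scattered gives a large spread and is therefore judged ambiguous; one on which the judges agree gives a small spread.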

Beside the objective criterion of ambiguity, an
objective criterion of irrelevance was devised to check
against the procedures. This criterion was concerned with
the records of actual votes. The 130 statements were
mimeographed and given out to three hundred subjects
to check those statements with which they agreed, that is,
those which represented their own convictions. These
returns could be studied for internal consistency; any
inconsistency was in this case attributed to the defects of
the statements, so that it proved of a further help in
eliminating



the unsuitable statements from the scale. This was done by
statistically determining an Index of Similarity and
comparing every statement with every other statement to see if
all the statements having approximately the same scale
value would be indorsed similarly by people whose rating
on the scale was similar.

A final  list  of  forty-five  statements  was  selected 
from  the  original  list  of  130  opinions.  The  selection  was 
made  with  consideration  of  the  criterion  of  ambiguity, 
the  criterion  of  irrelevance,  the  scale  values,  and  by 
inspection  of  the  statements.  The  statements  were  so 
selected  that  they  constituted  a more  or  less  uniformly 
graduated  series  of  scale  values.  The  scale  on  which 
the  statements  were  placed  by  the  sorting  is  a linear 
scale  representing  a scale  from  extreme  favorableness 
toward  the  church  to  extreme  antagonism  toward  the  church. 

In its final form this set of statements was
given out to several hundred subjects for actual voting,
with an inserted blank asking for other information
with which the scores on the attitude scale could be
correlated. The objectiveness of such an instrument and
its practical nature and worthwhileness make it of
value to the religious educator.

Most of us will only use these scales, but some
in the field of religious education must face the problems
of constructing them, which, briefly summarized, are these.
First are the problems of determining the scale or type of
scale which can measure attitudes. Can attitude be measured
on a linear scale? If so, measurement is limited to that phase
of attitude which can be spoken of as "more" or "less". Then
what kind of a scale-line shall be used to show this
"more" or "less" quality? In this case it was a base line
of equally appearing intervals indicating more or less
favorableness toward the church. Next came the problem of
selecting the content of the test. It was not selected by
someone sitting down and making good use of his imagination,
but by getting an expression of the viewpoints of many people
and thereby getting content vital to current thinking and
life. Justification for the use of expressed opinions as
partially representing an attitude had to be demonstrated.

The next problem was forming these test elements by subjecting
them to certain criteria, and then presenting this
material in the form of statements to be sorted, and by
this method attempting to solve the problems of ambiguity
and find the scale value objectively. Getting large
numbers of people of varying opinions to give their
opinions, and getting the three hundred judges and the several
hundred subjects for experimental use, constitutes a real
problem. Getting an objective criterion by which to test
the test is always a problem in test construction. Then there
is the making of correlations with other important data to find
to what extent other factors determine or influence the rating
by this scale. One has always to face the problem of the
extent to which intelligence affects a scale rating, as well as
the influence of other factors.

The first problem one faces in the measurement
of conduct is the nature of conduct itself. Is conduct to
be summed up as purely mechanistic reactions to stimuli? To
what extent is it controlled or motivated by ideals? The
answer to these questions will determine how one shall go
about measuring the extent to which an individual behaves
morally. It has already been stated that the outstanding
work in the measurement of conduct has been done
by Hartshorne and May. They have classified the problems
of testing conduct into (1) social behavior - the performance
factors, and (2) self-control - the relation of
all the factors to one another and to social self-integration.
A paragraph quoted from Hartshorne and May¹ will
show the way in which conduct may be regarded. "Studies
in Deceit includes as much as possible of the subtle
interrelations between behavior and knowledge, opinion and
attitude, but is incomplete at this point in view of
the practical limitations of a single volume and the
need of concentrating in a subsequent section the various
problems related to the organization of the self which it
has been possible to study in the course of the Inquiry.
The reader is asked, therefore, to remember that while it
is necessary to study deceit objectively as behavior
which can be observed and measured, there is in this no
implication that this behavior, apart from its causes,
consequences, and other concomitants, has any significance.
We hold no brief for absolutism in morality or psychology,
which in either case tends to attribute to isolated acts
some mystic meaning other than may be found in the entire
train and system of experience of which the act is an
inseparable part. No progress can be made, however, unless
the overt act be observed, and, if possible, measured
without any reference, for the moment, to its motive, or
its rightness, or its wrongness. The first question to ask
is, What did the subject do? Until this question is answered
in quantitative terms so that what he did is clearly known,
there is little use in going on to ask why he did it, and
still less use in speculating whether he is to be blamed
or praised."

1. Hartshorne and May, "Studies in Deceit," page 11

One of the great problems is to define precisely what
one wants his test to measure. It is necessary to be very
specific. According to Hartshorne and May,¹ "The term
'specific' is used as descriptive of a cross section of
individual behavior. It represents, however, a useful
theoretical reconstruction rather than direct observation.
To be sure, observation does show positive correlations
among the various types of conduct which common sense would
classify as honesty, service or what not; but the implications
of our study are that the general law of transfer offers an
adequate explanation of these correlations.... The bridge
between these two observed phenomena - the correlation of
two sets of responses and the identity of potent elements
in two situations - is the theory of specificity, which
interprets behavior as a function of the situation."

1. Hartshorne and May, "Studies in Service and Self-Control,"
pages 445 f

One must remember, then, that the conduct tests of the
Inquiry will measure only what the subject will do in the
particular situation in which he is being tested. There
seems to be no conduct measurement that can claim more. It
is going still farther, however, for the Inquiry to have to
admit that these tests as they are used on groups may
measure group conduct and still not be good measures of
individuals. To quote from their conclusions,¹ "Any
attempt to summarize such complex relationships as we have
been dealing with is bound to be inadequate and unsatisfactory.
Two general conclusions are emerging which account for
the intricate nature of our findings. The first is that
the conduct trends and their relations to one another
in individuals are the precipitates of specific experiences
and are functions of the situations to which
they have become attached by habit. The second is that
these specific trends and relationships are gathered into
patterns which represent not general ideas about conduct
but, rather, specific group tendencies. That is, the group
with its accumulating experience, is a common factor in the
situation to which its members are exposed."

1. Ibid., page 445

Some of the techniques used by the Inquiry have already
been described. In setting up their techniques they tried to
satisfy as many as possible of the requirements which should
be met by tests of this type, and so formulated ten criteria.
These ten criteria are here quoted, for they reveal some
of the problems with which they were faced and how they
sought to meet them:¹

"1. The test situation should be as far as possible
a natural situation. It should also be a controlled situation.
The response should as far as possible be natural
even when directed.

2. The test situation and the response should be of
such a nature as to allow all subjects equal opportunity
to exhibit the behavior which is being tested. That is,
there should be nothing about the test itself which
should prevent anyone who desired to achieve from so
doing; on the other hand, there should be nothing about
it to trick an honest person into an act he would
repudiate if he were aware of its import.

3. No test should subject the child to any moral strain
beyond that to which he is subjected in the natural course
of his actual life situations.

4. The test should not put the subject and the examiner
in false social relations to one another. The examiner
should guard against being deceptive himself in order
to test the subject.

5. The test should have "low visibility"; that is, it
should be of such a nature as not to arouse the suspicions
of the subject. This is one of the fundamental difficulties
in all such testing since the entire purpose of the test
cannot be announced in advance. This criterion is all the
more difficult to meet when coupled with criterion number
four, for the examiner must keep secret one aspect of his
purpose and at the same time be honest with his subjects.

6. The activity demanded of the subject in taking the
test should have real values for him whether he is aware
of these values or not.

7. The test should be of such a nature as not to be
spoiled by publicity.

8. If tests are to be used in statistical studies they
should be group tests. They should also be easy to administer
and should be mechanically scored. They should be short
enough to give in single school periods.

9. The test results should be clear and unambiguous.
It should be obvious from the results whether the subject
did or did not exhibit the behavior in question. The evidence
should be such as would be accepted in a court of law.

10. The scores should be quantitative, showing the amount
as well as the fact of deception. Each test, therefore, should
be flexible enough to include within its scope wide ranges of
deceptive tendency."

1. Hartshorne and May, "Studies in Deceit," pages 47-48

These requirements were quite rigid, and many of the
techniques tried were left out of the final battery because
they failed to meet enough of the requirements to
satisfy the standards set by the test makers. Trying to
meet these requirements presents many problems. In the
first instance, imagine the difficulty in trying to effect
a natural and controlled situation at the same time. This
means the utmost care in devising a test situation in which
it is possible to direct a response that will still be
natural. Following through each criterion one is faced with
the many problems which are raised but cannot be restated
here. The three volumes named which have been put out by
the Inquiry take these problems up in detail.

In every case the Inquiry attempted to find out as much
about the subject as possible apart from their test program
in order to have some device for checking up on their
tests. Establishing criteria for testing or validating the
test raises a big problem in the construction of tests of
conduct. The Inquiry not only had pupils give information
about themselves but also had them rated by classmates
and teachers. This information was checked against the
information revealed by the tests.

Aside from the devices for testing conduct like those
used by the Inquiry, there is some attempt to measure
personality traits by the use of rating scales, as was suggested
in Section III. These cannot be called objective measuring
instruments, however, for they depend upon subjective
judgements. While they may be helpful in finding out something
about a subject, they do not measure his conduct, so the
problems of construction of such scales will not be discussed
here.

There are some problems in test construction which are
common to the building or devising and using of all kinds of
tests. First of all, the test-maker should know psychology -
know it theoretically, practically and experimentally. It
is necessary to know the characteristics and nature of the
subject being tested, whether child, adolescent or adult.
A study of personality, both of the normal and abnormal
responses which may be expected and interpreted, is necessary.
The educational principles which are based on psychology
must be kept in mind and testing must conform to them.

The second problem common to all test construction is a
knowledge of the curricula - both the background of experiences
and the subject matter material. The one making the test must
be familiar with the field of instruction, and with the child's
background of experience which he brings to this learning
experience.

A third common problem is to construct a test that will
make possible a description of a pupil's performance in terms
of the dimensions¹ or characteristics that are significant.
This depends, of course, on the use that is to be made of the
test. Fourth, experimentation in test construction involves
the use of statistics. The validity of a test is determined
through the correlation with a criterion, and the reliability
of the test scores is measured by its self-correlation, the
Index of Reliability, or the Standard or Probable Error of
measurement. The interpretation of tests, including further
reference to the statistical problem, will appear later. Other
common problems have been mentioned in connection with the
different kinds of testing.
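
The statistical measures just named can be illustrated with a short sketch. The text does not give formulas, so the sketch assumes the conventional ones: the product-moment correlation for both validity and self-correlation, the Index of Reliability as the square root of the self-correlation, the Standard Error of measurement as the standard deviation times the square root of one minus the self-correlation, and the Probable Error as 0.6745 times the Standard Error. The function names and the two "forms" of a test are hypothetical and serve only for illustration.

```python
import math
import statistics


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


def reliability_summary(form_a, form_b):
    """Self-correlation of two forms and the measures derived from it."""
    r = pearson_r(form_a, form_b)               # reliability coefficient
    sd = statistics.pstdev(form_a)              # spread of the obtained scores
    se = sd * math.sqrt(1 - r)                  # standard error of measurement
    return {
        "self_correlation": r,
        # Assumed convention: the index is the square root of a positive r.
        "index_of_reliability": math.sqrt(r) if r > 0 else float("nan"),
        "standard_error": se,
        "probable_error": 0.6745 * se,
    }


# Validity is found the same way, by correlating test scores with a criterion:
# validity = pearson_r(test_scores, criterion_ratings)
```

A high self-correlation yields a small Standard Error, which is why the two are treated in the text as alternative expressions of the same property.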

In summary of Section IV, it is well to remember that
while most Bible teachers will not be chiefly concerned with
building or constructing tests, yet all who use tests or
have to make any of their own tests should know something
about the problems involved in their construction. Educational
literature contains much material on this topic, but
this paper only attempts to enumerate some of these problems.
They have been briefly classified as defining what one is
going to measure and determining the technique which will
best measure that thing; getting acquainted with the procedures
involved in building tests of knowledge; discovering
the nature of attitude and the complexity of problems
involved in attempting to construct a measuring scale;
making conduct specific enough to measure, understanding
the common elements in different behavior responses, and
conforming to criteria of good test construction such as those
set up by the Character Education Inquiry. The problems vary
according to the type of test being constructed, although
there are some problems common to all.

1. Monroe, The Theory of Educational Measurements, pages 86 ff

Section V. The Problems Involved in Giving Tests and in the
Interpretation and Using of Test Results.

No one should give a test just because it is the
thing to do. First of all, and back of all, lies the reason
for the test and the purpose in giving it. This purpose
should be very clearly defined in the test-giver's mind so
that he can choose his test in harmony with the purpose, and
later interpret it in the same light. When the purpose is
clearly understood, one is apt to choose tests which reveal
what one wants to know and tests whose results may be
interpreted in the light of that purpose.

The motive for taking the test should be so reasonable
and desirable that the fullest cooperation can be secured
from the subjects. Watson says,¹ "The test should be interesting.
Most of the tests in moral and religious education do
seem to prove interesting to people who take them. The
concomitant attitudes are certainly more rewarding for a test
of this sort." Variety and test elements "closely related
to the major emotional centers in the lives of the people
who take them"¹ help to make tests interesting. In testing
attitudes and conduct it is often difficult to make a test
indirect enough to secure the subject's normal reactions
and still avoid giving a false impression as to why the test
is taken. It is necessary for the test to have a good effect
upon the subject both at the time he is taking it and after
he has taken it. The value of the test should also be
considered from the viewpoint of the teacher and of others who
may use the test in evaluating progress. Does it stimulate
better and more effective effort, or does it cramp the teacher
by making her think that only the phases which the tests
measure are important, so that she comes to teach only those
things which she thinks will help her pupils to score high
on the tests? Or it may be that unwise supervisors impose
tests that are superficial or too difficult, so that teachers
become discouraged or else disgusted with the tests. The
tests should be so chosen that they will have good effects
on both teachers and pupils.

1. Watson, G. B., "Experimentation and Measurement in
Religious Education"

Another problem in giving tests is to control the
conditions, that is, to make the conditions the same for all
pupils who take the test. In standardized tests this is
necessary in order to compare the results of people taking
the tests at different times and in different places. In
ordinary tests it is helpful to eliminate all extraneous
elements that tend to influence a pupil's performance so that
a truer conception of his ability may be gained from the test.
Seating, lighting, noise, giving of directions, seatmates,
classmates and all such elements tend to influence a
pupil's performance. The pupil should be given the advantage
of the best conditions possible.

When the test has been given, the interpretation of
test results will be found to be very important. First, there
is the scoring of the test. Depending on the type of test,
and the directions for giving it, one must decide whether
he will correct for chance. Chance may mean guessing on a
true-false test when the answer is not known. Here there is
a fifty-fifty chance that the subject will guess either the
right or the wrong answer. In a conduct test, chance may
mean accidentally making the right response when the subject
was not consciously controlling his conduct in line with a
right motive. The scoring scheme is usually determined by
the test maker and given in the directions, as the test
maker's business is to find out the best way of scoring
before he releases his test.
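
The correction for chance on a true-false test may be sketched as follows. The text does not state a formula; the sketch assumes the conventional one, in which the score is the number right minus the number wrong divided by one less than the number of choices, so that for true-false items the score is simply rights minus wrongs. Omitted items are neither rewarded nor penalized. The function name and its parameters are hypothetical.

```python
def corrected_score(right, wrong, choices=2):
    """Score corrected for chance: rights minus a share of the wrongs.

    With two choices (true-false) each wrong answer cancels one right
    answer, on the assumption that blind guesses split evenly.
    Omitted items are not passed in and so do not affect the score.
    """
    if choices < 2:
        raise ValueError("an item needs at least two choices")
    return right - wrong / (choices - 1)


# A pupil who guesses blindly on forty true-false items and happens to
# get twenty right and twenty wrong is credited with no knowledge.
blind_guesser = corrected_score(right=20, wrong=20)
```

Whether to apply such a correction remains the examiner's decision and should follow the test maker's directions where these are given.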

Usually the manner of weighting the exercises has been
given in the instructions for giving and scoring the test.
If questions have been unequally weighted, one must know
what should be the basis for determining the weights assigned
them. Knowledge of this basis will help in interpreting
the results, as well as in scoring the test. If an examiner's
purpose varies somewhat from the test-maker's, he may want
to reevaluate and weight his scores accordingly. All of
this must be considered in the scoring.

When the tests have been scored, the data must be so
tabulated that they can be used for the fullest interpretation.
It is usually desirable to have them in such form that both
individual and group scores may be referred to and considered.
Tables showing other relationships such as name, age, grade
and scores ought to be made in order to make comparisons
and help to interpret the results. Tables, charts, and
graphs are highly desirable in some cases. Rules for
constructing such tables, charts and graphs are to be found
in books dealing with statistical treatment of tests and
test results, but some of the most common are enumerated
here for illustration of what is meant: tables showing
relationships - in reality correlation tables - of ages
and scores, of public school grades and scores, etc.;
tables showing the performance of each pupil on the test,
with their total scores, and also the number of people who
answered each question right, the number wrong and the
number omitting it altogether; and frequency tables showing
group scores, which are almost always necessary. Graphs are
helpful in presenting these data to others, as they reveal the
data in picture form so that they may be easily seen as a whole.
Graphs giving a picture of the distribution of the pupils'
scores help anyone to see and analyze this group to some
extent in a brief moment of time. In the interpretation,
extreme scores, a large range, or a large variability
should always be accounted for if possible.
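
Two of the tables enumerated above can be sketched briefly: the table of right, wrong, and omitted answers for each question, and the frequency table of group scores. The layout of the answer sheets (a mapping from pupil name to a list of answers, with None marking an omission), the class-interval width of five, and all function names are assumptions made only for this illustration.

```python
from collections import Counter


def item_tally(answer_sheets, key):
    """Count right, wrong, and omitted answers for each question."""
    tally = [{"right": 0, "wrong": 0, "omitted": 0} for _ in key]
    for answers in answer_sheets.values():
        for i, (given, correct) in enumerate(zip(answers, key)):
            if given is None:
                tally[i]["omitted"] += 1
            elif given == correct:
                tally[i]["right"] += 1
            else:
                tally[i]["wrong"] += 1
    return tally


def total_scores(answer_sheets, key):
    """Total number right for each pupil."""
    return {
        name: sum(given == correct for given, correct in zip(answers, key))
        for name, answers in answer_sheets.items()
    }


def frequency_table(scores, width=5):
    """Group whole-number scores into class intervals of the given width."""
    counts = Counter((score // width) * width for score in scores)
    return {(low, low + width - 1): counts[low] for low in sorted(counts)}
```

The item tally shows at a glance which questions most pupils missed or omitted, while the frequency table is the basis for the graphs of distribution mentioned above.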

Besides the tabulation of data in useable form there
is a statistical treatment of the data which is necessary for
their interpretation. The frequency table has been mentioned.
It is used as a basis from which are found these other useful
measures, namely central tendency and variability. "The value
of a measure of central tendency is two-fold: in the first
place, it is a single measure which represents all of the
scores made by the group, and as such gives a concise
description of the performance of the group as a whole; secondly,
it enables us to compare two or more groups in terms of
typical performance. There are three measures of central
tendency in common use, (1) the average or arithmetic mean,
(2) the median, (3) and the mode.... The next step is the
calculation of the variability of the scores, that is, of the
'scatter' or 'spread' of the separate scores or measures
around their measure of central tendency."¹ The measure of
variability is useful in discovering how much territory
is covered by any one group and will show the homogeneity
of the individuals within a group and whether they are of
nearly the same ability or of widely differing abilities.
The measures which show variability are the range, the
quartile deviation, and the standard deviation. The
standard deviation is the most often used and most accurate,
and it is necessary when correlations and measures of
reliability are later to be computed. Statistical treatment
is also necessary to show the reliability of a measure.
One wants to know to what extent his measure is a true
measure of an individual's capacity or ability or achievement,
and the measures of reliability give us the amount by which
an obtained measure "most probably" varies from its
corresponding true measure. Reliability measures may also be
found for groups.¹

1. Garrett, H. E., "Statistics in Psychology and Education,"
pages 8 and 16
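
The measures of central tendency and variability named above may be computed as in the following sketch, which relies on Python's standard statistics module. The function name is hypothetical. Two choices are assumptions rather than anything the text prescribes: the standard deviation is taken over the whole group (the population form), and the quartiles follow the module's default method, so the quartile deviation may differ slightly from a hand calculation made by another rule.

```python
import statistics


def describe(scores):
    """Central tendency and variability for one group's scores."""
    q1, _, q3 = statistics.quantiles(scores, n=4)   # first and third quartiles
    return {
        "mean": statistics.fmean(scores),
        "median": statistics.median(scores),
        "mode": statistics.multimode(scores),       # a list, since ties are possible
        "range": max(scores) - min(scores),
        "quartile_deviation": (q3 - q1) / 2,        # half the interquartile range
        "standard_deviation": statistics.pstdev(scores),
    }


summary = describe([2, 4, 4, 4, 5, 5, 7, 9])
```

Two groups with the same mean may differ widely in standard deviation, which is the case the text has in view when it speaks of groups of nearly the same or of widely differing abilities.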

We have already indicated that statistics are
necessary in weighting questions or exercises and in
determining the validity of test scores, and that they are
useful in combining the scores from different tests. The
statistical problem, both in construction of tests and in the
interpretation of test results, is a necessary one from the
standpoint of research and experimentation. In order to show
relationships and make predictions, the statistical treatment
known as correlation is of great importance. To
illustrate, a group's average score on an ethical judgement
test may be compared with its age, with its length of
attendance in the church school and with its public school grade.
By such comparison it may be possible to ascertain which of
these factors exerts the most influence on the ethical
judgement score, or to what extent each of them affects this
ability. Then knowing the ability of an individual in one
test, which is highly correlated with another test, ought
to give us some indication of his ability in this other test.
Many of our methods in teaching can be improved when we know
to what extent certain methods, rather than other methods,
correlate highest with right attitudes, or control of conduct
according to the ideals of Jesus. Correlations help to make
these things possible.

1. Ibid., Chapter III
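
The two uses of correlation described above, comparing several factors with a test score and estimating one score from another, can be sketched as follows. The product-moment coefficient and the least-squares regression line are assumed here as the conventional techniques; the function names and the factor names are hypothetical. It should be remembered that a correlation shows only the degree of association between two measures and does not by itself establish which factor influences the other.

```python
import math
import statistics


def correlation(x, y):
    """Product-moment correlation between two equal-length lists."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


def rank_factors(scores, factors):
    """Return (factor name, r) pairs, the closest relationship first."""
    pairs = [(name, correlation(values, scores)) for name, values in factors.items()]
    return sorted(pairs, key=lambda pair: abs(pair[1]), reverse=True)


def predict(x_known, y_known, x_new):
    """Estimate a score on test Y from a score on test X by the regression line."""
    mean_x, mean_y = statistics.fmean(x_known), statistics.fmean(y_known)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x_known, y_known))
    slope = cross / sum((a - mean_x) ** 2 for a in x_known)
    return mean_y + slope * (x_new - mean_x)
```

The closer the correlation between the two tests is to unity, the more confidence may be placed in such an estimate.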

One  of  the  chief  influences  or  relationships  to  be 
considered  in  any  testing  procedure  and  interpretation  is 
the  influence  of  intelligence  on  the  pupil’s  rating  in  the 
test.  Usually  correlations  are  made  with  intelligence  when 
a test  is  in  the  experimental  stage  for  it  is  necessary  to 
know  how  much  there  is  in  common  between  intelligence  and  the 
ability  being  measured. 

Another problem in interpretation closely associated
with the preceding one is to determine to what extent the
subject would by nature achieve a given amount of the ability
and to what extent he has achieved this ability because of
the instruction or training given as the result of Bible
teaching. Statistics may help to solve this problem, but the
most important interpretation of test results in any case
is a common sense one based on one's knowledge of the
subjects and their relation to others, and on one's ability
to reason and see relationships which make the subject's
reactions understandable. A common sense attitude and general
background of knowledge and experience are of more benefit
in interpretation than all the statistics without these
qualities.

Some people who test and interpret their results
still fail to get results from their testing because they
do not use the information gleaned from testing and the
interpretation of results to remedy their faulty situations.
Suggestions for remedial treatment should be worked out from
the interpretations of the test results and from the general
knowledge and the common sense of the administrator,
supervisor or teacher giving the test. To be able to suggest
remedies assumes wide experience in testing and in teaching.
In presenting recommendations, one should consider (a) the
objectives of the school for this particular group and
individual, (b) the character of the curriculum material for
attaining these objectives, and if necessary (c) the teaching
methods which will help attain these objectives.¹ One
illustration may suffice. In testing a Church School
Intermediate Department on Biblical information, using the
Whitley Biblical Knowledge Test (New Testament, Form A 4 3),
the test was analyzed to determine whether the pupil knew more
about the life of Jesus or the life of Paul; whether he was
strong in his knowledge of the facts about Jesus' life, and
yet weak when it came to knowing what Jesus taught. Take for
example pupil A: there was a possible score of fifteen on the
biographical facts of the life of Jesus, and her score was one.
She had no knowledge of the life of Paul or other characters
in the New Testament. With such facts revealed a teacher may
analyze her teaching procedure with this child something as
follows: this girl is an intermediate pupil and her knowledge
of the Bible is almost all in terms of abstract principles.
She is of the age where she is going to need the emotional
and idealistic elements which are found in personalities
to act as a stimulant to her loyalties. I should, therefore,
help her to become acquainted with Jesus as a personality,
and with other personalities, so that her loyalty and attitude
will be strengthened sufficiently to steady her in time of
need. Such an analysis might be carried on, not only for
this pupil, but also for each pupil until the teacher's task
is determined for her by the need of her pupils.

1. Eanson, W. L., Supervision of Religious Education, Class
Lecture Notes.

To summarize Section V, the problems involved in
giving tests and in interpreting test results may be divided
into three classes: (1) the problems of giving a test, such
as determining the purpose and choosing the test in harmony
with this purpose, motivating the test both for teachers and
pupils, and controlling the conditions under which the test
is given; (2) the problems of interpreting the test, which
involve scoring, tabulation of data, statistical treatment of
data, and a common sense interpretation; (3) the problems
in suggesting recommendations for remedial treatment. All
who use tests are faced with these problems and should
become familiar with literature which helps them to use tests
effectively.

VI. Conclusions as to the General Trend and Future Outlook
for the Use of Tests and Measurements in Measuring the
Outcomes of Bible Teaching.

It is difficult to limit a treatment of the problems
involved in measuring the outcomes of Bible teaching because
there are so many problems and there is very little general
acquaintance with them. The whole subject is rather new in
its  scientific  aspect.  To  some  extent,  people  have  always 
tried to measure the results of their efforts. Even the
Bible teacher who says of her pupils, "John has become a
better boy than Dick," is measuring or judging the results
of their character education. It has become necessary in
Religious  Education  as  in  other  fields,  however,  to  make 
this  measurement  more  accurate  and  objective. 

One of the first problems involved is fundamental,
for it asks: Shall we retain the Bible in the curriculum
of  Religious  Education?  The  present  trend  in  Religious 
Education  is  to  submit  all  curriculum  material  to  the 
pragmatic test and ask, "What does it do?" In so far as Bible
teaching is found to have desirable results in the production
of Christian character it will be used; otherwise it will be
eliminated. A comparison of present-day text-books with those used even
five years ago shows a decided decrease in the amount of
Bible material, and a decided increase in definiteness of
purpose in the amount that is used, the method of its use,
and the emphasis given it. This is an indication
of the trend toward the defining and determining of goals
or  objectives  to  cover  a progressive,  moral,  and  religious 
training  in  the  lives  of  boys  and  girls.  At  present  these 
objectives  are  such  that  a large  amount  of  Bible  material 
is  still  included  in  the  course  of  study  in  the  Protestant 
churches.  Its  inclusion  in  the  future  will  depend  on  the 
extent  to  which  Bible  teaching  results  are  measured  and  found 
to  have  significance  in  producing  Christian  life  and  conduct. 
So little of such measurement has been done that it is difficult
to forecast the future from these limited results, but
so  far  as  statements  of  objectives  can  reveal  - for  instance, 
those  stated  by  the  International  Council  of  Religious  Ed- 
ucation - Bible  teaching  is  essential  in  the  building  of 
Christian  character. 

Again, because of the newness of objective measurements
in religious education, the problem of convincing
religious educators and administrators of the need for
measurement  seems  paramount  at  present.  Public  education 
faced  this  same  problem,  and  very  soon,  because  the  need 
was  so  apparent,  there  was  a great  enthusiasm  for  objective 
tests.  Enthusiasm  exceeded  caution  and  careful  work  so 
that  teachers  and  educators  soon  became  disgusted  with  the 
superficial testing being done. Now the pendulum has swung
back in the other direction. If religious educators profit
by this experience, they will proceed carefully in the
construction of worthwhile tests that are vital to their
need rather than rushing to extremes, putting out tests that
produce  undesirable  reactions  or  claiming  much  for  tests 
that  are  really  unreliable.  One  can  hardly  study  carefully 
into  the  present  status  of  personality  tests  and  say  with 
assurance  that  they  measure  what  they  claim  to  measure  at 
all.  Indeed,  we  even  become  skeptical  enough  to  claim 
that  one  cannot  measure  personality  traits  with  the  present 
measuring  instruments  and  that  we  have  not  yet  found  the 
kind  of  instruments  that  will  do  so.  Perhaps  the  view  of 
either extreme is undesirable, and with a perfecting of our
present procedures the most cautious will gain more
confidence. This especially applies to group testing.

Already some of the dangers of a hasty enthusiasm
have been discovered. The information or knowledge tests
have become readily popular. Examination of text-books in
our church schools reveals the use of the more common
techniques, such as true-false tests, in the review lessons.
The reactions reveal that some of these tests are carelessly
constructed, with over-emphasis on minor details and
minor facts and neglect of the great religious
ideas. Furthermore, the teachers are not given
instructions  as  to  how  to  interpret  these  tests.  It  may 
be  because  they  are  too  carelessly  constructed  to  have 
significant  interpretation.  Another  caution  that  has 
been  found  necessary  is  to  realize  that  a pupil’s  advance 
is  not  all  due  to  the  course  being  studied,  but  partly  to 
his general environment. There are many elements entering
into religious and moral growth besides the outgrowths of
a course of study. The Inquiry found that the home and
playmates were probably the greatest contributing factors
to  the  growth  of  ideas. 

One of the most evident conclusions of this study
ought to be that the whole problem of testing is far from
being a simple thing. Many people, hearing about the need
for tests and measurements of progress, think they can
very simply meet this need by the use of a simple test or
two and then go blissfully on their way. Only one
instance  of  the  complexity  of  this  problem  is  needed  to 
prove the point. In measuring that part of conduct or
character called service traits, the Inquiry tried out a
number of techniques and developed and used five of them.
Then comes this statement:¹ "In order to measure completely
the sort of behavior included in these tests (service),
there would be needed 31 more like them. . . . We have
once more demonstrated that a battery which might properly
claim to measure any single type of conduct must include
a large number of tests representing a variety of
situations."

Since the denominational publishers are getting out
tests and studying the testing problem, and since the
International Council of Religious Education has authorized a
committee to study the need for standardizing tests (to report
the findings back to the Educational Commission), the outlook
for the future is a more careful and helpful testing
procedure. The Psychology and Education departments of
our Universities are working with the problem of measuring
the dynamic factors. Since attitudes are the essential motives
of conduct, the field will prove profitable to the extent
that valid and reliable tests of desirable attitudes are
made.

1. Hartshorne and May, Studies in Service and Self-Control,
pages 264-265.

Surely this study has revealed that while there are
many problems still unsolved, yet there is a decided interest
in these problems and progress is being made in their
solution.  We  have  learned,  too,  that  progress  is  made 
through  measurement.  One  needs  to  work  toward  a goal  and 
stop  often  to  check  up  with  his  goal  to  see  if  he  is  still 
working  in  the  right  direction.  So  in  religious  education 
there  must  be  constant  checking  up  with  objectives  and 
constant going ahead. Since this is progress, and tests are
a means of going ahead because they check up, there will
be a growing refinement and application of measurement
to the results of all religious education, including Bible
teaching, the extent of which only the future can reveal.